
Dec 3, 2012

Solvency II or Basel III ? Model Fallacy

Managing investment models - ALM models in particular - is a professional art. One of the trickiest risk management fallacies in dealing with these models is that they are used to identify so-called 'bad scenarios', which are then 'hedged away'.

To illustrate what is happening, join me in a short everyday ALM thought experiment...

Before that, I must warn you... this is going to be a long, technical, but hopefully interesting blog. I'll try to keep the discussion at 'high school level'. Stay with me, I promise: it actuarially pays off in the end!

ALM Thought Experiment
  • Testing the asset Mix
    Suppose the board of our Insurance Company or Pension Fund is testing the current strategic asset mix with help of an ALM model in order to find out more about the future risk characteristics of the chosen portfolio.
     
  • Simulation
    The ALM model runs a 'thousands of scenarios simulation', to find out under which conditions and in which scenarios the 'required return' is met and to test if results are in line with the defined risk appetite.
     
  • Quantum Asset Return Space
    In order to stay as close to reality as possible, let's assume that the 'Quantum Asset Return Space' in which the asset mix has to deliver its required returns for a fixed chosen duration horizon N, consists of: 
    1. 999,900 scenarios with Positive Outcomes ( POs ),
      where the asset returns meet the required return, and 
    2. 100 scenarios with Negative Outcomes ( NOs ),
      where the asset returns fail to meet the required return.
       
    Choose 'N' virtually anywhere between a fraction of a year and 50 years, in line with your liability duration.
     

  • Confidence (Base) Rate
    From the above example we may conclude that the N-year confidence base rate of a positive scenario outcome (in short: assets meet liabilities) is in reality 99.99%, and the N-year probability of a company default due to insufficient asset returns is in reality 0.01%.
     
  • From Quantum Space to Reality
    As the strategic asset mix 'performs' in a quantum reality, nobody - no board member or expert - can tell which of the quantum ('potential') scenarios will come true in the next N years or (even) what the exact individual quantum scenarios are.

    Nevertheless, these quantum scenarios all exist in "Quantum Asset Return Space" (QARS) and only one of those quantum scenarios will finally turn out as the one and only 'return reality'.

    Which one...(?), we can only tell after the scenario has manifested itself after N years.
     
  • Defining the ALM Model
    Now we start defining our ALM model. Like any model, our ALM model is an approximation of reality (or more specifically: of the above-defined 'quantum reality') in which we are forced to make simplifications, like: defining an 'average return', defining 'risk' as standard deviation, and defining a 'normal' or other type of model as the basis for drawing 'scenarios' in our ALM's simulation process.
    Therefore our ALM model is not, and cannot be, perfect.

    Now, because our model isn't perfect, let's assume that our 'high quality' ALM model has an overall Error Rate of 1% (ER=1%), more specifically (and simplified) defined as:
    1. The model generates Negative Scenario Outcomes (NSOs) (= required return not met) with an error rate of 1%. In other words: in 1% of the cases the model generates a positive outcome scenario when it should have generated a negative outcome scenario.
       
    2. The model generates Positive Scenario Outcomes (PSOs) (= required return met) with an error rate of 1%. In other words: in 1% of the cases the model generates a negative outcome scenario when it should have generated a positive outcome scenario.
       

The Key Question!
Now that we've set up our ALM model, we run a simulation with an arbitrary number of runs. Here is the visual outcome:


As you may notice, the resulting ALM graph tells us more than a billion numbers... At once it's clear that one of the scenarios (the blue one) has a very negative, unwanted outcome.
The investment advisor suggests 'hedging this scenario away'. You, as an actuary, raise the key question:

What is the probability that a Negative Outcome (NO) scenario in the ALM model is indeed truly a negative outcome and not a false outcome due to the fact that the model is not perfect?

With this question, you hit the nail (right) on the head...
Do you know the answer? Is it exactly 99%, more, or less?

Before reading further, try to answer the question and do not cheat by scrolling down.....

To help prevent you from reading further by accident, I have inserted a pointful YouTube video:



Answer 
Now here is the answer: the probability that any of the NOs (Negative Outcomes) in the ALM study - and not only the very negative blue one - is truly a NO, and not in fact a PO (Positive Outcome) and therefore a false NO, is - fasten your seat belts - 0.98%! (no misspelling here!)

Warning
So there's a 99.02% (=100%-0.98%) probability that any Negative Outcome from our model is totally wrong. Therefore one must be very cautious and careful in drawing conclusions and formulating risk management actions upon negative scenarios from ALM models in general.

Explanation
Here's the short Excel-like explanation, which is based on Bayes' Theorem.
You can download the Excel spreadsheet here.
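For those without the spreadsheet at hand, the same Bayes' Theorem calculation can be sketched in a few lines of Python, using the numbers defined above:

```python
# Bayes' Theorem: P(true NO | model says NO)
p_no = 0.0001          # base rate of true Negative Outcomes (100 / 1,000,000)
p_po = 1 - p_no        # base rate of true Positive Outcomes
er = 0.01              # model error rate (both directions)

# P(model says NO) = P(model NO | true NO)*P(NO) + P(model NO | true PO)*P(PO)
p_model_no = (1 - er) * p_no + er * p_po
p_true_no_given_model_no = (1 - er) * p_no / p_model_no

print(f"{p_true_no_given_model_no:.2%}")  # → 0.98%
```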


There is MORE!
Now you might argue that the low probability (0.98%) of finding true Negative Outcomes is due to the high (99.99%) Positive Outcome rate, and that 99.99% is unrealistically much higher than - for instance - the Basel III confidence level of 99.9%. Well..., you're absolutely right. As high positive outcome rates correspond one to one with high confidence levels, here are the results for other positive outcome rates that equal certain well-known (future) standard confidence levels (N := 1 year):


What can we conclude from this graph?
If the relative share of positive outcomes - and therefore the confidence level - rises, the probability that an identified Negative Outcome Scenario is true decreases dramatically fast toward zero. To put it in other words:

At high confidence levels, (ALM) models cannot identify negative scenarios anymore!


Higher Error Rates
Now keep in mind that we calculated all this with a high-quality error rate of 1%. What about higher model error rates? Here's the outcome:


As expected, the situation gets worse as the model error rate increases: negative scenarios become even less detectable...

U.S. Pension Funds
The 50% Confidence Level is added because a lot of U.S. pension funds are in this confidence area. In this case we find - more or less regardless of the model error rate level - a substantial probability (80%-90%) of finding true negative outcome scenarios. The problem here is that it's useless to define actions on individual negative scenarios. First priority should be to restructure and cut ambition in the current pension agreement, in order to realize a higher confidence level. It's useless to mop the kitchen when your house is flooded with water...

Model Error Rate Determination
One might argue that the approach in this blog is too theoretical, as it's impossible to determine the exact (future) error rate of a model. Yes, it's true that the exact model error rate is hard to determine. However, with the help of backtesting, the magnitude of the model error rate can be roughly estimated, and that's good enough for drawing relevant conclusions.

A General Detectability Equation
The general equation for calculating the Detectability (Rate) of Negative Outcome Scenarios (DNOS), given the model error rate (ER) and a trusted Confidence Level (CL), is:

DNOS = (1 - ER)(1 - CL) / (1 - CL + 2·ER·CL - ER)

Example
So a model error rate of 1%, combined with the Basel III confidence level of 99.9%, results in a low 9.02% [ =(1-0.01)*(1-0.999)/(1-0.999+2*0.01*0.999-0.01) ] detectability of Negative Outcome scenarios.
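The equation translates directly into code:

```python
def dnos(er: float, cl: float) -> float:
    """Detectability of Negative Outcome Scenarios, per the equation above."""
    return (1 - er) * (1 - cl) / (1 - cl + 2 * er * cl - er)

# Basel III confidence level (99.9%) with a 1% model error rate:
print(f"{dnos(0.01, 0.999):.2%}")   # → 9.02%
# The 99.99% case from the thought experiment:
print(f"{dnos(0.01, 0.9999):.2%}")  # → 0.98%
```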

Detectability Rates
Here's a more complete overview of detectability rates:


It would take (impossibly?) super-high-quality model error rates of 0.1% or lower to regain detectability power in our (ALM) models, as is shown in the next table:



Required Model Confidence Level
If we define the Model Confidence Level as MCL = 1 - MER (with MER the model error rate), the rate of Detectability of Negative Outcome Scenarios as DR = Detectability Rate = DNOS, and CL as the Positive Outcome Scenarios' Confidence Level, we can calculate and visualize the required Model Confidence Levels (MCL) as follows:

From this graph it's clear at a glance that already modest Confidence Levels (>90%), in combination with a modest Detectability Rate of 90%, lead to unrealistic required Model Confidence Rates of around 99% or more. Let's not discuss the required Model Confidence Rates for Solvency II and/or Basel II/III.
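Solving the detectability equation for ER gives the required Model Confidence Level directly; a short sketch:

```python
def required_mcl(dr: float, cl: float) -> float:
    """Model Confidence Level (1 - ER) needed for detectability dr at
    confidence level cl, obtained by solving
    dr = (1-er)(1-cl) / (1-cl + 2*er*cl - er) for er."""
    er = (1 - cl) * (1 - dr) / (dr * (2 * cl - 1) + 1 - cl)
    return 1 - er

# Modest 90% confidence level with a modest 90% detectability target:
print(f"{required_mcl(0.9, 0.9):.1%}")    # → 98.8%
# Basel III (99.9%) with the same 90% detectability target:
print(f"{required_mcl(0.9, 0.999):.2%}")  # → 99.99%
```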

Conclusions
  1. Current models lose power
    Because (ALM) models are limited (model error rates of 1%-5%) and confidence levels are increasing (above 99%) due to more severe regulation, models significantly lose power and therefore become useless in detecting true negative outcome scenarios in a simulation. This implies that models lose their significance with respect to adequate risk management, because it's impossible to detect whether any negative outcome scenario is realistic.
     
  2. Current models not Solvency II and Basel II/III proof
    From (1) we can conclude in general that - despite our sound methods - our models are probably not Solvency II and Basel II/III proof. The first action to take is to get insight into the error rate of our models in high confidence environments...
     
  3. New models?
    The alternative and challenge for actuaries and investment modelers is to develop new models with substantial lower model error rates (< 0.1%).

    Key Question: Is that possible?

    If you are inclined to think it is, please keep in mind that human beings have an error rate of about 1% and computer programs an error rate of about 3%...
     


Sep 25, 2011

Compliance: Sample Size

How to set an adequate sample size in case of a compliance check?

This simple question ultimately has a simple answer, but it can become a "mer à boire" (an endless undertaking) in case of a 'classic' sample size approach...

In my last-but-one blog called 'Pisa or Actuarial Compliant?', I already stressed the importance of checking compliance in the actuarial work field.

Compliance is important not only from an actuarial perspective, but also from a core business viewpoint:

Compliance is the main key driver for sustainable business

Minimizing Total Cost by Compliance
A short illustration: we all know that compliance costs are part of Quality Control costs (QC costs) and that the costs of noncompliance (NC costs) increase with the noncompliance rate.

'NC costs' mainly relate to:
  • Penalties or administrative fines of the (legal) regulators
  • Extra cost of complaint handling
  • Client claims
  • Extra administrative cost
  • Cost of legal procedures

Sampling costs - in turn - are a (substantial) part of QC costs.

More generally, it's the art of good compliance management practice to determine the maximum noncompliance rate that minimizes a company's total cost.



Although this approach is more or less standard, in practice companies' revenues depend strongly on the level of compliance. In other words: if compliance increases, revenues increase and variable costs decrease.

This implies that introducing 'cost driven compliance management' will - in general - (1) reduce the total cost and (2) mostly make room for additional investments in 'QC costs' to improve compliance and to lower variable and total costs.

In practice you'll probably have to calibrate (together with other QC investment costs) to find the optimal cost (investment) level that minimizes the total cost as a percentage of the revenues.


As is clear, modeling this kind of stuff is no work for amateurs. It's real risk management craftsmanship. After all, the effect of cost investments is not certain and depends on all kinds of probabilities and circumstances that need to be carefully modeled and calibrated.
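The cost trade-off described above can be sketched as a toy optimization. The exponential cost curve and all parameters below are purely illustrative assumptions, not calibrated figures:

```python
import math

def total_cost(qc_spend: float) -> float:
    """Toy model: total cost = QC spend + noncompliance cost.
    The decay constant (50) and penalty scale (1000) are illustrative
    assumptions; in practice these need careful modeling and calibration."""
    nc_rate = 0.20 * math.exp(-qc_spend / 50.0)  # noncompliance falls with QC spend
    nc_cost = 1000.0 * nc_rate                   # penalties, claims, handling, ...
    return qc_spend + nc_cost

# Grid search for the QC spend level that minimizes total cost
best = min(range(0, 301), key=total_cost)
print(best, round(total_cost(best), 1))
```

Spending nothing or spending everything on QC both lose; the minimum sits where an extra unit of QC spend no longer saves more than a unit of NC cost.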

From this meta perspective, let's descend to a down-to-earth 'real life example'.

'Compliance Check' Example
As you probably know, pension advisors have to be compliant and meet strict federal, state and local regulations.

On behalf of the employee, the sponsoring employer as well as the insurer or pension fund all have a strong interest that the involved 'Pension Advisor' actually is, acts, and remains compliant.

PensionAdvice
A professional local Pension Advisor firm, 'PensionAdvice' (fictitious name), wants 'compliance' to become a 'calling card' for their company. The target is that 'compliance' will become a competitive advantage over its rivals.

You, as an actuary, are asked to advise on the issue of how to verify PensionAdvice's compliance....... What to do?


  • Step 1 : Compliance Definition
    First you ask the board of PensionAdvice what compliance means.
    After several discussions, compliance is briefly defined as:

    1. Compliance Quality
      Meeting the regulator's (12 step) legal compliance requirements
      ('Quality Advice Second Pillar Pension')

    2. Compliance Quantity
      A 100% compliance target of PensionAdvice's portfolio, with a 5% non-compliance rate (error rate) as a maximum on basis of a 95% confidence level.

    The board has no idea about the (f)actual level of compliance. Compliance was - until now - not addressed on a more detailed employer dossier level.
    Therefore you decide to start with a simple sample approach.

  • Step 2 : Define Sample Size
    In order to define the right sample size, portfolio size is important.
    After a quick call, PensionAdvice gives you a rough estimate of their portfolio: around 2,500 employer pension dossiers.

    You pick up your 'sample table spreadsheet' and are confronted with the first serious issue.
    An adequate sample (95% confidence level) would require a minimum of 334 samples. With around 10-20 hours of research per dossier, the costs of a sampling project of this size would get way out of hand and become unacceptable, as they would raise the total cost of PensionAdvice (check this before you conclude so!).

    Lowering the confidence level doesn't solve the problem either. Sample sizes of 100 and more are still too costly, and confidence levels of less than 95% are of no value in relation to the client's ambition (compliance = calling card).
    The same goes for a higher - more than 5% - 'Error Tolerance'...

    By the way, in case of samples from small populations, things will not turn out better. To achieve relevant confidence levels (>95%) and error tolerances (<5%), samples must have a substantial size in relation to the population size.


    You can check all this out 'live' in the next spreadsheet. Use the 'Click to Edit' button to modify the sampling conditions to your own needs. If you don't know the variability of the population, use a 'safe' variability of 50%. Click 'Sample Size II' for modeling the sample size of PensionAdvice.
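The classical calculation behind the 334 figure can be sketched as follows (95% confidence, 5% error tolerance, the 'safe' 50% variability, and the standard finite population correction):

```python
import math

def sample_size(population: int, z: float = 1.96,
                margin: float = 0.05, variability: float = 0.5) -> int:
    """Classical sample size (Cochran's formula) with finite population
    correction. z = 1.96 corresponds to a 95% confidence level."""
    n0 = z**2 * variability * (1 - variability) / margin**2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(sample_size(2500))  # → 334
```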



  • Step 3: Use Bayesian Sample Model
    The above standard approach of sampling could deliver smaller samples if we were sure of low variability.

    Unfortunately we (often) do not know the variability upfront.

    Here, a method based on efficient sampling and Bayesian statistics, as clearly described by Matthew Leitch, comes to the rescue.

    A simplified version of Leitch's approach is based on Laplace's famous 'Rule of Succession', a classic application of the beta distribution ( Technical explanation (click) ).

    The interesting aspects of this method are:
    1. Prior (weak or small) samples or beliefs about the true error rate and confidence levels can be added to the model in the form of an (artificial) additional (pre)sample.

    2. As the sample size increases, it becomes clear whether the defined confidence level will be met and whether adding more samples is appropriate and/or cost effective.

    This way unnecessary samples are avoided, sampling becomes as cost-effective as possible, and auditor and client can dynamically develop a grip on the distribution. Enough talk, let's demonstrate how this works.

Sample Demonstration
The next example is contained in an Excel spreadsheet that you can download, and it is presented in a simplified Zoho spreadsheet at the end of this blog. You can modify this spreadsheet (online!) to your own needs and use it for real-life compliance sampling. Use it with care in case of small populations (n<100).

A. Check the prior beliefs of management
Management estimates the actual NonCompliance rate at 8%, with 90% confidence that it is 8% or less:



If management has no idea at all, or if you would rather not include management's opinion, simply set both (NonCompliance rate and confidence) at 50% (= indifferent) in your model.

B. Define Management Objectives
After some discussion, management defines the (target) maximum acceptable NonCompliance rate at 5%, with a 95% confidence level (=CL).



C. Start Sampling
Before you start sampling, please notice how the prior beliefs of management are rendered into a fictitious sample (test number = 0) in the model:
  • In this case the prior beliefs match a fictitious sample of size 27 with zero noncompliance observations. 
  • This fictitious sample corresponds to a confidence level of 76% on the basis of a maximum (population) noncompliance rate of 5%.
[ If you think this rendering is too optimistic, you can change the fictitious number of noncompliance observations from zero into 1, 2 or another number (examine in the spreadsheet what happens and play around). ]

To lift the 76% confidence level to 95%, it would take an additional sample of size 31 with zero noncompliance outcomes (you can check this in the spreadsheet).
As sampling is expensive, your employee Jos runs a first test (test 1) with a sample size of 10 and zero noncompliance outcomes. This looks promising!
The cumulative confidence level has risen from 76% to over 85%.



You decide to take another limited sample with a sample size of 10. Unfortunately this sample contains one noncompliant outcome. As a result, the cumulative confidence level drops to almost 70% and another sample of size 45 with zero noncompliant outcomes is necessary to reach the desired 95% confidence level.
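The mechanics behind these confidence levels can be sketched with Laplace's rule of succession: under a uniform prior the posterior is a beta distribution, whose CDF (for a whole number of noncompliant observations) reduces to a binomial tail sum. This is a sketch of the model's logic, not the spreadsheet itself:

```python
from math import comb

def confidence(n_obs: int, n_noncompliant: int, max_rate: float = 0.05) -> float:
    """P(noncompliance rate <= max_rate), uniform prior (rule of succession).
    Posterior is Beta(1 + noncompliant, 1 + compliant); its CDF at max_rate
    equals a binomial tail sum with n = n_obs + 1 trials."""
    a = 1 + n_noncompliant
    n = n_obs + 1
    return sum(comb(n, k) * max_rate**k * (1 - max_rate)**(n - k)
               for k in range(a, n + 1))

print(f"{confidence(27, 0, 0.08):.0%}")  # prior belief: 8% or less → 90%
print(f"{confidence(27, 0):.0%}")        # same prior at the 5% target → 76%
print(f"{confidence(37, 0):.0%}")        # + clean sample of 10 → 86%
print(f"{confidence(47, 1):.0%}")        # + sample of 10 with 1 failure → 70%

# How many further clean observations to reach the 95% target?
extra = 0
while confidence(47 + extra, 1) < 0.95:
    extra += 1
print(extra)                             # → 45
```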

You decide to go on, and after a few more tests you finally arrive at the intended 95% cumulative confidence level. Mission accomplished!



The great advantage of this incremental sampling method is that if noncompliance shows up at an early stage, you can:
  • stop sampling, without having incurred major sampling costs
  • improve compliance of the population by means of additional measures, based on the learnings from the noncompliant outcomes
  • start sampling again (from the start) 

If - for example - test 1 had had 3 noncompliant outcomes instead of zero, it would take an additional test of size 115 with zero noncompliant outcomes to achieve a 95% confidence level. It's clear that in this case it's better to first learn from the 3 noncompliant outcomes what's wrong or needs improvement, than to continue expensive sampling against your better judgment.



D. Conclusions
On the basis of the prior belief that - with 90% confidence - the population is 8% noncompliant, we can now conclude that after an additional total sample of size 65, PensionAdvice's noncompliance rate is 5% or less with a 95% confidence level.

If we want to be 95% sure without the 'prior belief', we'll have to take an additional sample of size 27 with zero noncompliant outcomes as a result.

E. Check out

Check out the next spreadsheet 'online'. Use the 'Click to Edit' button to modify the sampling conditions to your own needs, or download the Excel spreadsheet.



Finally
Apologies for this much-too-long blog. I hope I've succeeded in keeping your attention...


Related links / Resources

I. Download official Maggid Excel spreadsheets:
- Dynamic Compliance Sampling (2011)
- Small Sample Size Calculator

II. Related links/ Sources:
- 'Efficient Sampling' spreadsheet by Matthew Leitch
- What Is The Right Sample Size For A Survey?
- Sample Size
- Epidemiology
- Probability of adverse events that have not yet occurred
- Progressive Sampling (Pdf)
- The True Cost of Compliance
- Bayesian modeling (ppt)

May 10, 2011

Homo Actuarius Bayesianis

Bayesian fallacies are often the trickiest...

A classical example of a Bayesian fallacy is the so called "Prosecutor's fallacy" in case of DNA testing...

Multiple DNA testing (Source: Wikipedia)
A crime-scene DNA sample is compared against a database of 20,000 men.

A match is found, the corresponding man is accused and at his trial, it is testified that the probability that two DNA profiles match by chance is only 1 in 10,000.


Sounds logical, doesn't it?
Yes... it 'sounds' logical. But this does not mean that the probability that the suspect is innocent is also 1 in 10,000. Since 20,000 men were tested, there were 20,000 opportunities to find a match by chance.

Even if none of the men in the database left the crime-scene DNA, a match by chance to an innocent man is more likely than not. The chance of getting at least one match among the records is in this case:

P(at least one match) = 1 - (1 - 1/10,000)^20,000 ≈ 86%

So, this evidence alone is an uncompelling data dredging result. If the culprit was in the database then he and one or more other men would probably be matched; in either case, it would be a fallacy to ignore the number of records searched when weighing the evidence. "Cold hits" like this on DNA data-banks are now understood to require careful presentation as trial evidence.
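A quick check of the 'at least one chance match' probability:

```python
# Probability of at least one chance match when screening a 20,000-record
# database with a 1-in-10,000 per-record match probability
p_match = 1 / 10_000
n_records = 20_000
p_at_least_one = 1 - (1 - p_match) ** n_records
print(f"{p_at_least_one:.0%}")  # → 86%
```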

In a similar (Dutch) case, an innocent nurse (Lucia de Berk) was at first wrongly accused (and convicted!) of murdering several of her patients.

Other Bayesian fallacies
Bayesian fallacies can come close to the actuarial profession and even be humorous, as the next two examples show:
  1. Pension Fund Management
    It turns out that, of all pension board members that were involved in a pension fund deficit, only 25% invested more than half in stocks.

    Therefore 75% of the pension fund board members with a pension fund deficit invested 50% or less in stocks.


    From this we may conclude that pension fund board members would have done and would do better by investing more in stocks....

  2. The Drunken Driver
    It turns out that, of all drivers involved in car crashes, 41% were drunk and 59% sober.

    Therefore to limit the probability of a car crash it's better to drink...


It's often not easy to recognize the 'Bayesian Monster' in your models. If in doubt, always set up a 2-by-2 contingency table to check the conclusions....
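The drunk-driver fallacy is easily checked with base rates. The 2% share of drunk drivers on the road below is an illustrative assumption (not a figure from the sources), but it shows why 41% of crash involvement is alarming rather than reassuring:

```python
# The drunk-driver fallacy, checked with base rates.
p_drunk_given_crash = 0.41   # from the crash statistics quoted above
p_sober_given_crash = 0.59
p_drunk = 0.02               # assumed share of drunk drivers on the road
p_sober = 0.98

# Relative crash risk: P(crash|drunk) / P(crash|sober), via Bayes' theorem
# (the unknown overall P(crash) cancels out of the ratio).
relative_risk = (p_drunk_given_crash / p_drunk) / (p_sober_given_crash / p_sober)
print(round(relative_risk, 1))  # → 34.1
```

So under this assumed base rate, drinking multiplies crash risk roughly 34-fold: the exact opposite of the joke's conclusion.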


Homo Actuarius
Let's dive into the historical development of Asset Liability Management (ALM) to illustrate the different stages we as actuaries went through to finally cope with Bayesian statistics. We do this by going (far) back to prehistoric actuarial times.
 

As we all know, the word actuary originated from the Latin word actuarius (the person who occupied this position kept the minutes at the sessions of the Senate in Ancient Rome). This explains part of the name-giving of our species.

Going back further in time, we recognize the following species of actuaries...

  1. Homo Actuarius Apriorius
    This actuarial creature (we could hardly call him an actuary) establishes the probability of a hypothesis, no matter what the data tell.

    ALM example: H0: E(return)=4.0%. Contributions, liabilities and investments are all calculated at 4%. What the data tell is uninteresting.

  2. Homo Actuarius Pragmaticus
    The more developed 'Homo Actuarius Pragmaticus' demonstrates he's only interested in (the results of) the data.
    ALM example: In my experiments I found x=4.0%, full stop.
    Therefore, let's calculate with this 4.0%.

  3. Homo Actuarius Frequentistus
    In this stage, the 'Homo Actuarius Frequentistus' measures the probability of the data given a certain hypothesis.

    ALM example: If H0: E(return)=4.0%, then the probability of observing a value more extreme than the one I observed is given by an appropriate expression. Don't ask me whether my observed value is near the true one; I can only tell you that if my observed value is the true one, then the probability of observing data more extreme than mine is given by that expression.
    In this stage the so-called Monte Carlo method was developed...

  4. Homo Actuarius Contemplatus
    The Homo Actuarius Contemplatus measures the probability of the data and of the hypothesis.

    ALM example
    : You decide to adopt the (divided!) yearly advice of the 'Parameters Committee' and base your ALM on the maximum expected value for the return on fixed-income securities, which is at that moment 4.0%. Every year you also measure the (deviation of the) real data and start contemplating how the two might match... (btw: they don't!)

  5. Homo Actuarius Bayesianis
    The Homo Actuarius Bayesianis measures the probability of the hypothesis, given the data. Where the Frequentistus' approach was about 'modeling mechanisms' in the world, the Bayesian interpretation is more about 'modeling rational reasoning'.

    ALM example: Given the data of a certain period, we test whether H0: E(return)=4.0% is true: near 4.0% with a P% (P=99?) confidence level.


Knowledge: All probabilities are conditional
Knowledge is a strange phenomenon...

When I was born I knew nothing about everything.
When I grew up I learned something about some thing.
Now I've grown old I know everything about nothing.


Joshua Maggid


The moment we become aware that ALL probabilities - even quantum probabilities - are in fact hidden conditional Bayesian probabilities, we (as actuaries) get enlightened (if you don't : don't worry, just fake it and read on)!

Simple Proof: P(A)=P(A|S), where S is the set of all possible outcomes.

From this moment on your probabilistic life will change.

To demonstrate this, examine the next simple example.

Tossing a coin
  • When tossing a coin, we all know: P (heads)=0.5
  • However, we implicitly assumed a 'fair coin', didn't we?
  • So what we in fact stated was: P (heads|fair)=0.5
  • Now a small problem appears on the horizon: We all know a fair coin is hypothetical, it doesn't really exist in a real world as every 'real coin' has some physical properties and/or environmental circumstances that makes it more or less biased.
  • We cannot but conclude that the expression
    'P (heads|fair)=0.5' is theoretically true, but unfortunately has no practical value.
  • The only way out is to define fairness in a practical way, by stating something like: 0.4999 ≤ P(heads|fair) ≤ 0.5001
  • Conclusion: defining single point estimates is practically useless; always define estimate intervals (based on confidence levels).
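The interval idea can be illustrated with a hypothetical coin experiment. The toss counts below are made up, and the normal approximation of the beta posterior is an assumption that only holds for large samples:

```python
import math

# Hypothetical experiment: 10,000 tosses, 5,037 heads (illustrative numbers)
heads, tosses = 5_037, 10_000

# Uniform prior + observations → Beta(heads+1, tails+1) posterior;
# for large n, a normal approximation of that posterior is adequate.
a, b = heads + 1, tosses - heads + 1
mean = a / (a + b)
sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
lo, hi = mean - 1.96 * sd, mean + 1.96 * sd
print(f"95% interval for P(heads): [{lo:.4f}, {hi:.4f}]")
```

Instead of the useless point claim 'P(heads)=0.5', we get an interval statement that the coin's bias lies in a narrow band around 0.5 with 95% confidence.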

From this beginner's example, let's move on to something more actuarial:

Estimating Interest Rates: A Multi Economic Approach
  • Suppose you base your (ALM) Bond Returns (R) upon:
    μ= E(R)=4%
    and σ=2%

    Regardless of what kind of brilliant interest-generating model (Monte Carlo or whatever) you have developed, chances are your model is based upon several implicit assumptions about variables like inflation or unemployment.

    The actual return (Rt) at time (t) depends on many (correlated, mostly exogenous) variables like Inflation (I), Unemployment (U), GDP growth (G), Country (C) and, last but not least, past returns (R[t-x]).

    A well-defined Asset Liability Model should therefore define (Rt) on the basis of a 'Multi Economic Approach' (MEA), in a form that looks more or less like: Rt = F(I, U, G, σ, R[t-1], R[t-2], etc.)

  • In discussing with the board which future economic scenarios are most likely and can be used as strategic scenarios, we (actuaries) will be better able to advise with the help of MEA. This approach, based on new technical economic models and intensive discussions with the board, will guarantee more realistic output and better-underpinned decision making.
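As a purely illustrative sketch of such an MEA return function: every coefficient below is an assumption for demonstration only, not a calibrated model:

```python
# Toy sketch of a Multi Economic Approach return function
# Rt = F(I, U, G, sigma, R[t-1], ...); all coefficients are illustrative
# assumptions, not estimates from any real dataset.
def next_return(inflation: float, unemployment: float, gdp_growth: float,
                prev_return: float) -> float:
    base = 0.01                        # assumed structural component
    return (base
            + 0.5 * inflation          # compensation for inflation
            - 0.2 * unemployment       # weak economy depresses returns
            + 0.3 * gdp_growth         # growth lifts returns
            + 0.4 * prev_return)       # autocorrelation with R[t-1]

# One scenario step: 2% inflation, 5% unemployment, 1.5% growth, 4% last return
r = next_return(0.02, 0.05, 0.015, 0.04)
print(f"{r:.2%}")  # → 3.05%
```

In a real MEA model the exogenous variables would themselves be simulated (and correlated) per scenario; the point is only that Rt is driven by economic state, not drawn from a single fixed distribution.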


Sources and related links:
I. Stats....
- Make your own car crash query
- Alcohol-Impaired Driving Fatalities (National Statistics)
- Drunk Driving Fatalities in America (2009)
- Drunk Driving Facts (2006)

II. Humor, Cartoons, Inspiration...
- Jesse van Muylwijck Cartoons (The Judge)
- PHDCOMICS
- Interference : Evolution inspired by Mike West

III. Bayesian Math....
- New Conceptual Approach of the Interpretation of Clinical Tests (2004)
- The Bayesian logic of frequency-based conjunction fallacies (pdf,2011)
- The Bayesian Fallacy: Distinguishing Four Kinds of Beliefs (2008)
- Resource Material for Promoting the Bayesian View of Everything
- A Constructivist View of the Statistical Quantification of Evidence
- Conditional Probability and Conditional Expectation
- Getting fair results from a biased coin
- INTRODUCTION TO MATHEMATICAL FINANCE