How do you set an adequate sample size for a compliance check?
This simple question ultimately has a simple answer, but it can become a 'mer à boire' (a nightmare) in the case of a 'classic' sample size approach...
In my last-but-one blog, 'Pisa or Actuarial Compliant?', I already stressed the importance of checking compliance in the actuarial work field.
Compliance is important not only from an actuarial perspective, but also from a core business viewpoint:
Compliance is a key driver of sustainable business
Minimizing Total Cost by Compliance
A short illustration: we all know that compliance costs are part of Quality Control costs (QC costs) and that the costs of noncompliance (NC costs) increase with the noncompliance rate.
NC costs mainly relate to:
- Penalties or administrative fines of the (legal) regulators
- Extra cost of complaint handling
- Client claims
- Extra administrative cost
- Cost of legal procedures
Sampling costs, in turn, are a (substantial) part of QC costs.
More generally, it is the art of good compliance management to determine the maximum noncompliance rate that minimizes the company's total cost.
Although this approach is more or less standard, in practice a company's revenues depend strongly on its level of compliance. In other words: if compliance increases, revenues increase and variable costs decrease.
This implies that introducing 'cost driven compliance management' will, in general, (1) reduce the total cost and (2) usually make room for additional investments in QC to improve compliance and lower variable and total costs further.
In practice you'll probably have to calibrate (together with other QC investment costs) to find the optimal investment level that minimizes the total cost as a percentage of revenues. A toy sketch of this optimization follows below.
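To make this concrete, here is a minimal Python sketch of such a calibration. All cost curves in it are made-up assumptions for illustration only (not PensionAdvice data or a prescribed model): QC spending is assumed to reduce the noncompliance rate exponentially, and compliance is assumed to lift revenues.

```python
# Toy model: find the QC spend that minimizes total cost as a share of revenue.
# All curves below are illustrative assumptions, not real company data.
import numpy as np

qc_cost = np.linspace(0.0, 5.0, 501)        # QC spend (e.g. EUR millions)
nc_rate = 0.20 * np.exp(-1.2 * qc_cost)     # assumption: more QC -> lower NC rate
nc_cost = 40.0 * nc_rate                    # assumption: fines, claims, rework
revenue = 100.0 * (1.0 - 0.5 * nc_rate)     # assumption: compliance lifts revenue

total_cost_ratio = (qc_cost + nc_cost) / revenue
best = np.argmin(total_cost_ratio)
print(f"optimal QC spend: {qc_cost[best]:.2f}, "
      f"minimal total cost ratio: {total_cost_ratio[best]:.3f}")
```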
As is clear, modeling this kind of stuff is no work for amateurs. It's real risk management craftsmanship. After all, the effect of cost investments is uncertain and depends on all kinds of probabilities and circumstances that need to be carefully modeled and calibrated.
From this meta perspective, let's descend to a down-to-earth 'real life' example.
'Compliance Check' Example
As you probably know, pension advisors have to be compliant and meet strict federal, state and local regulations.
The employee, the sponsoring employer, and the insurer or pension fund all have a strong interest in the 'Pension Advisor' involved actually being, acting and remaining compliant.
PensionAdvice
A professional local pension advisory firm, 'PensionAdvice' (a fictitious name), wants compliance to become a 'calling card' for the company. The target is for compliance to become a competitive advantage over its rivals.
You, as an actuary, are asked to advise on how to verify PensionAdvice's compliance... What to do?
- Step 1 : Compliance Definition
First you ask the board of PensionAdvice what compliance means.
After several discussions, compliance is, in short, defined as:
- Compliance Quality
Meeting the regulator's (12 step) legal compliance requirements
('Quality Advice Second Pillar Pension')
- Compliance Quantity
A 100% compliance target for PensionAdvice's portfolio, with a maximum noncompliance rate (error rate) of 5% on the basis of a 95% confidence level.
The board has no idea about the (f)actual level of compliance. Until now, compliance was not addressed at the more detailed employer dossier level.
Therefore you decide to start with a simple sampling approach.
- Step 2 : Define Sample Size
To define the right sample size, the portfolio size is important.
After a quick call, PensionAdvice gives you a rough estimate of their portfolio: around 2,500 employer pension dossiers.
You pick up your 'sample table spreadsheet' and are confronted with the first serious issue.
An adequate sample (95% confidence level, 5% error tolerance) would require a minimum of 334 dossiers. At around 10-20 hours of research per dossier, the cost of a sampling project this size would get way out of hand and become unacceptable, as it would raise PensionAdvice's total cost (check this before you draw that conclusion!).
Lowering the confidence level doesn't solve the problem either. Sample sizes of 100 and more are still too costly, and confidence levels of less than 95% are of no value in relation to the client's ambition (compliance = calling card).
The same goes for a higher, more than 5%, 'error tolerance'...
By the way, samples from small populations do not turn out any better. To achieve relevant confidence levels (>95%) and error tolerances (<5%), samples must have a substantial size in relation to the population size.
You can check all this out 'live' in the next spreadsheet and modify the sampling conditions to your own needs. If you don't know the variability of the population, use a 'safe' variability of 50%. Click 'Sample Size II' to model the sample size for PensionAdvice.
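For reference, this classic number can be reproduced with Cochran's sample size formula plus a finite population correction. The sketch below is a standard textbook calculation, assumed (not confirmed) to mirror the spreadsheet's logic; it uses the 'safe' 50% variability mentioned above:

```python
# Classic sample size for estimating a proportion, with finite
# population correction (Cochran's formula). Assumed to mirror
# the spreadsheet's logic.
import math

def classic_sample_size(population, z=1.96, error_tolerance=0.05,
                        variability=0.5):
    # z = 1.96 corresponds to a 95% confidence level;
    # variability = 0.5 is the 'safe' worst case.
    n0 = (z ** 2) * variability * (1 - variability) / error_tolerance ** 2
    n = n0 / (1 + (n0 - 1) / population)    # finite population correction
    return math.ceil(n)

print(classic_sample_size(2500))  # -> 334 dossiers
```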
- Step 3: Use Bayesian Sample Model
The standard sampling approach above could deliver smaller samples if we were sure of a low variability.
Unfortunately, we (often) do not know the variability upfront.
Here a method based on efficient sampling and Bayesian statistics comes to the rescue, as clearly described by Matthew Leitch.
A more simplified version of Leitch's approach is based on Laplace's famous 'rule of succession', a classic application of the beta distribution (technical explanation).
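In code, this simplified version is tiny. The sketch below is my own minimal rendering of the rule of succession, assuming a uniform Beta(1,1) prior; it is not Leitch's actual spreadsheet. After observing k noncompliant dossiers in a sample of n, the posterior for the noncompliance rate is Beta(k+1, n-k+1), and the confidence level is the posterior probability that the rate does not exceed the tolerated maximum.

```python
# Rule-of-succession confidence: a minimal sketch assuming a
# uniform Beta(1, 1) prior on the noncompliance rate.
from scipy.stats import beta

def compliance_confidence(n, k, max_nc_rate=0.05):
    """Posterior probability that the true noncompliance rate is at
    most max_nc_rate, after k noncompliant dossiers out of n sampled."""
    return beta.cdf(max_nc_rate, k + 1, n - k + 1)

# Example: 27 dossiers sampled, none noncompliant
print(compliance_confidence(27, 0))  # -> ~0.76 (76% confidence)
```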
The interesting aspects of this method are:
- Prior (weak or small) samples, or beliefs about the true error rate and confidence level, can be added to the model in the form of an (artificial) additional (pre)sample.
- As the sample size increases, it becomes clear whether the defined confidence level will be met and whether adding more samples is appropriate and/or cost effective.
This way unnecessary samples are avoided, sampling becomes as cost effective as possible, and auditor and client can dynamically develop a grip on the distribution. Enough talk; let's demonstrate how this works.
Sample Demonstration
The next example is contained in an Excel spreadsheet that you can download and that is presented in a simplified spreadsheet at the end of this blog. You can modify this spreadsheet (online!) to your own needs and use it for real-life compliance sampling. Use it with care in the case of small populations (n<100).
A. Check the prior beliefs of management
Management estimates the actual noncompliance rate at 8%, with 90% confidence that the actual noncompliance rate is 8% or less:
If management has no idea at all, or if you do not want to include management's opinion, simply set both (noncompliance rate and confidence) at 50% (= indifferent) in your model.
B. Define Management Objectives
After some discussion, management defines the target maximum acceptable noncompliance rate as 5%, with a 95% confidence level (CL).
C. Start Sampling
Before you start sampling, please notice how the prior beliefs of management are rendered into a fictitious sample (test number = 0) in the model:
- In this case the prior beliefs match a fictitious sample of size 27 with zero noncompliance observations.
- This fictitious sample corresponds to a confidence level of 76%, on the basis of a maximum (population) noncompliance rate of 5%.
[If you think this rendering is too optimistic, you can change the fictitious number of noncompliance observations from zero to 1, 2 or another number (examine in the spreadsheet what happens and play around).]
To lift the 76% confidence level to 95%, it would take an additional sample of size 31 with zero noncompliance outcomes (you can check this in the spreadsheet, or in the sketch below).
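Assuming the rendering works as in the sketch above, you can reproduce these numbers yourself: the fictitious sample is the smallest zero-defect sample whose posterior matches management's stated belief (8% or less, with 90% confidence). This is my own reconstruction; the spreadsheet may round slightly differently.

```python
# Render a prior belief into a fictitious zero-defect sample:
# the smallest n whose Beta(1, n+1) posterior gives at least
# prior_confidence that the rate is <= prior_rate.
from scipy.stats import beta

def fictitious_sample_size(prior_rate, prior_confidence):
    n = 0
    while beta.cdf(prior_rate, 1, n + 1) < prior_confidence:
        n += 1
    return n

n0 = fictitious_sample_size(0.08, 0.90)
print(n0)                         # -> 27
print(beta.cdf(0.05, 1, n0 + 1))  # -> ~0.76 at the 5% tolerance
```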
As sampling is expensive, your employee Jos runs a first test (test 1) with a sample of size 10 and zero noncompliance outcomes. This looks promising!
The cumulative confidence level has risen from 76% to over 85%.
You decide to take another limited sample of size 10. Unfortunately, this sample contains one noncompliant outcome. As a result, the cumulative confidence level drops to almost 70%, and another sample of size 45 with zero noncompliant outcomes is necessary to reach the desired 95% confidence level.
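These cumulative confidence levels follow directly from the posterior sketched earlier, counting the fictitious prior sample of 27 along with the real observations:

```python
# Cumulative confidence after each test, fictitious prior sample included
print(compliance_confidence(27 + 10, 0))       # test 1: -> ~0.86
print(compliance_confidence(27 + 20, 1))       # test 2: -> ~0.70
print(compliance_confidence(27 + 20 + 45, 1))  # +45 clean: -> ~0.95
```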
You decide to go on, and after a few more tests you finally arrive at the intended 95% cumulative confidence level. Mission accomplished!
The great advantage of this incremental sampling method is that if noncompliance shows up at an early stage, you can:
- stop sampling, without having incurred major sampling costs
- improve compliance of the population by means of additional measures, based on the learnings from the noncompliant outcomes
- start sampling again (from the start)
If, for example, test 1 had had 3 noncompliant outcomes instead of zero, it would take an additional test of size 115 with zero noncompliant outcomes to achieve a 95% confidence level. Clearly, in this case it's better to first learn from the 3 noncompliant outcomes what's wrong or needs improvement than to go on with expensive sampling against your better judgment.
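The 'how many more clean samples do I need?' question can be answered with a small solver on top of the same posterior, again under the Beta(1,1) assumption sketched above:

```python
# Extra zero-noncompliance observations needed on top of n observed
# (k noncompliant) to reach the target confidence level.
from scipy.stats import beta

def additional_samples_needed(n, k, max_nc_rate=0.05, target_cl=0.95):
    m = 0
    while beta.cdf(max_nc_rate, k + 1, (n + m) - k + 1) < target_cl:
        m += 1
    return m

print(additional_samples_needed(27 + 20, 1))  # after test 2: -> 45
print(additional_samples_needed(27 + 10, 3))  # 3 failures in test 1: -> 115
```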
D. Conclusions
On the basis of the prior belief that, with 90% confidence, the population is at most 8% noncompliant, we can now conclude that after an additional total sample of size 65, PensionAdvice's noncompliance rate is 5% or less with a 95% confidence level.
If we want to be 95% sure without the 'prior belief', we'll have to take an additional sample of size 27 with zero noncompliant outcomes.
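Reusing the sketched helpers, this conclusion checks out: without the fictitious prior sample, the 65 real observations (of which 1 noncompliant) give roughly 85% confidence, and 27 additional clean samples would lift that to 95%.

```python
# Verification without the prior: 65 observed dossiers, 1 noncompliant
print(compliance_confidence(65, 1))      # -> ~0.85
print(additional_samples_needed(65, 1))  # -> 27
```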
E. Check out
Check out and download the next spreadsheet, and modify the sampling conditions to your own needs.
Finally
Apologies for this much too long blog. I hope I've succeeded in keeping your attention...
Related links / Resources
I. Download official Maggid Excel spreadsheets:
- Dynamic Compliance Sampling (2011)
- Small Sample Size Calculator
II. Related links / Sources:
- 'Efficient Sampling' spreadsheet by Matthew Leitch
- What Is The Right Sample Size For A Survey?
- Sample Size
- Epidemiology
- Probability of adverse events that have not yet occurred
- Progressive Sampling (PDF)
- The True Cost of Compliance
- Bayesian modeling (ppt)