Apr 16, 2017

All Models are Wrong

A 2016 paper by James Stuart et al. from the Institute and Faculty of Actuaries on 'Ersatz Models' (substitute models) states:

All models are deliberate simplifications of the real world. Attempts to demonstrate a model’s correctness can be expected to fail, or apparently to succeed because of test limitations, such as insufficient data.

We can explain this using an analogy involving milk. Cows’ milk is a staple part of European diets. For various reasons some people avoid it, preferring substitutes, or ersatz milk, for example made from soya. In a chemical laboratory, cows’ milk and soya milk are easily distinguished. Despite chemical differences, soya milk physically resembles cows’ milk in many ways - colour, density and viscosity, for example. For some purposes soya milk is a good substitute, but other recipes will produce acceptable results only with cows’ milk. The acceptance criteria for soya milk should depend on how the milk is to be used.

In the same way, with sufficient testing, we can always distinguish an ersatz model from whatever theoretical process drives reality. Instead, we should be concerned with a more modest aim: whether the ersatz model is good enough in the aspects that matter, that is, whether the modelling objective has been achieved.
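
This shift in aim can be sketched in a few lines of code. The example below is a minimal illustration, not code from the paper: it assumes a heavy-tailed Student-t distribution as a stand-in for 'reality', a fitted normal distribution as the ersatz model, and a 99.5% tail quantile as the aspect that matters.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(42)

    # Stand-in for "whatever theoretical process drives reality":
    # a heavy-tailed Student-t distribution (3 degrees of freedom).
    reality = rng.standard_t(df=3, size=100_000)

    # Ersatz model: a normal distribution fitted by moment matching.
    mu, sigma = reality.mean(), reality.std()

    # With 100,000 observations, a goodness-of-fit test will always
    # reject the normal model -- the chemical laboratory can tell the
    # milks apart. The more modest question: is the quantile that
    # matters close enough for the intended use?
    q = 0.995
    reality_tail = np.quantile(reality, q)
    ersatz_tail = norm.ppf(q, loc=mu, scale=sigma)

    print(f"reality 99.5% quantile: {reality_tail:.2f}")
    print(f"ersatz  99.5% quantile: {ersatz_tail:.2f}")

Here the ersatz model understates the tail (roughly 4.5 against 5.8), so it would fail a test whose tolerance is set by a capital-style objective, while it might pass comfortably for a use case that only depends on the centre of the distribution.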

The Model Problem
The paper starts with stories of models gone bad. Can our proposed generated data tests prevent a recurrence?

The Model Risk Working Party has explained how model risks arise not only from quantitative model features but also from social and cultural aspects of how a model is used. When a model fails, a variety of narratives may be offered to describe what went wrong. Experts may disagree about the causes of a crisis, depending on who knew, or could have known, about model limitations. Possible elements include:
  • A new risk emerged from nowhere and there was nothing anyone could have done to anticipate it - sometimes called a “black swan”.
  • The models had unknown weaknesses, which could have been revealed by more thorough testing.
  • Model users were well acquainted with model weaknesses, but these were not communicated to the senior management accountable for the business.
  • Everyone knew about the model weaknesses but they continued to take excessive risks regardless.
Ersatz testing can address some of these, as events too rare to feature in actual data may still occur in generated data (see the sketch after this list). Testing on generated data can also help to improve corporate culture towards model risk, as:
  • Hunches about what might go wrong are substantiated by objective analysis. While a hunch can be dismissed, it is difficult to suppress objective evidence or persuade analysts that the findings are irrelevant.
  • Ersatz tests highlight many model weaknesses, of greater or lesser importance. Experience with generated data testing can de-stigmatise test failure and so reduce the cultural pressure for cover-ups.
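
As an illustration of the rare-events point above (again not from the paper): an event with a 1-in-200 annual probability will usually be absent from a couple of decades of actual data, yet appears hundreds of times in a large generated data set, where the model’s response to it can be tested.

    import numpy as np

    rng = np.random.default_rng(0)
    p_event = 1 / 200        # an event with 0.5% annual probability
    history = 20             # years of actual data available

    # Chance the event never shows up in the historical record: ~90%.
    print(f"absent from history: {(1 - p_event) ** history:.0%}")

    # In generated data we choose the sample size, so the event is
    # well represented: roughly 500 occurrences in 100,000 scenarios.
    scenarios = rng.random(100_000) < p_event
    print(f"occurrences in generated data: {scenarios.sum()}")
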
We recognise that there is no mathematical solution to determine how extreme the reference models should be. This is essentially a social decision.

Corporate cultures may still arise where too narrow a selection of reference models is tested, and so model weaknesses remain hidden.
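
One hypothetical mitigation is to test against a spectrum of reference models rather than a single choice, so that the consequences of each level of extremity are at least visible before the (still social) decision is made. The sketch below reuses the Student-t/normal setup from the earlier example; the degrees-of-freedom values are illustrative only.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(7)
    q = 0.995

    # Smaller df = heavier tails = a more extreme reference model.
    for df in (30, 10, 5, 3):
        sample = rng.standard_t(df=df, size=200_000)
        mu, sigma = sample.mean(), sample.std()
        ref_tail = np.quantile(sample, q)
        ersatz_tail = norm.ppf(q, loc=mu, scale=sigma)
        gap = 100 * (ref_tail - ersatz_tail) / ref_tail
        print(f"df={df:>2}: ersatz model understates the tail by {gap:4.1f}%")

A culture that only ever tests the mildest reference model would conclude the ersatz model is adequate; the sweep shows how quickly that conclusion erodes as the reference model becomes more extreme.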

  1. Source: Ersatz Model Tests
    https://www.actuaries.org.uk/documents/ersatz-model-tests-0
    (Conclusions: page 35)
  2. Model Risk Working Party
    https://www.actuaries.org.uk/documents/sessional-paper-model-risk-daring-open-black-box
  3. The skinny kids are all drinking full cream milk
    http://heffalumpgeneration.co.za/the-skinny-kids-are-all-drinking-full-cream-milk/