Towards a philosophy of prediction IX: assumptions and how predictions break

Nicklas Berild Lundblad

2 years ago

All predictions are made from some model of the predicted phenomenon, and it is sometimes useful to think of that model as a set of assumptions about the world. Some assumptions will be very basic, like assumptions of continuity, uniformity and normalcy, whereas others may be more specific. One way of understanding a prediction, then, is to say that it is composed of assumptions and has the form:

(i) Given assumptions a(1)…a(x) it seems reasonable to predict X.

This simplified form suggests that we can judge the robustness of a prediction by looking more closely at the robustness of the set of assumptions used to make the prediction. What questions, then, should we ask about the assumptions? The obvious ones seem to be things like “is this a reasonably complete set of assumptions?” and “are these assumptions credible?”, but there are also other interesting questions, like the ones suggested by Charles Manski in his Identification for Prediction and Decision (HUP 2009).¹

In his work Manski challenges conventional economic analysis by advocating for a more nuanced approach to uncertainty and prediction. He critiques the reliance on strong assumptions in traditional models, which often lead to precise but potentially unreliable point predictions. Manski distinguishes between weak assumptions, which are more credible but yield less definitive conclusions, and strong assumptions, which produce more precise predictions but are often less realistic. He argues that weak assumptions, while resulting in partial identification and a range of possible outcomes, ultimately enhance the credibility of economic analysis. By contrast, strong assumptions may artificially narrow the range of predictions, potentially misleading decision-makers. Manski promotes the use of partial identification methods, which acknowledge data limitations and produce a spectrum of possible outcomes. This approach, he contends, leads to more robust and honest analysis, enabling policymakers to make better-informed decisions by understanding the full range of potential consequences and the true extent of uncertainty in economic predictions.
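
To make the contrast concrete, here is a minimal sketch (not taken from Manski's book) of worst-case bounds on a mean when some outcomes are missing. It assumes only that every outcome lies between `lo` and `hi`; the function names `mean_bounds` and `mean_point` and the survey answers are hypothetical. The weak assumption yields an interval, while the strong assumption that missing answers resemble the observed ones yields a single number.

```python
def mean_bounds(observed, n_missing, lo=0.0, hi=1.0):
    """Worst-case bounds on the population mean (weak assumption).

    Only assumes each missing outcome lies somewhere in [lo, hi].
    """
    n = len(observed) + n_missing
    if n == 0:
        raise ValueError("no data at all")
    total_obs = sum(observed)
    lower = (total_obs + lo * n_missing) / n  # every missing value at the floor
    upper = (total_obs + hi * n_missing) / n  # every missing value at the ceiling
    return lower, upper


def mean_point(observed):
    """Point estimate (strong assumption): missing values look like observed ones."""
    return sum(observed) / len(observed)


observed = [1, 0, 1, 1, 0, 1]  # hypothetical yes/no answers (1 = yes)
n_missing = 4                  # hypothetical number of non-respondents

lower, upper = mean_bounds(observed, n_missing)
point = mean_point(observed)
print(f"weak assumption:   mean is between {lower:.2f} and {upper:.2f}")
print(f"strong assumption: mean is {point:.2f}")
```

With six observed answers and four missing, the weak assumption only says that the mean lies somewhere between 0.4 and 0.8, while the strong one commits to roughly 0.67. The point estimate is more precise, but it is only as credible as the assumption behind it.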

This distinction between weak and strong assumptions strikes me as really important, and it almost suggests a principle for us to work with: that we should aim for the most accurate prediction possible on the weakest assumptions possible. And even if this principle does not always hold, we should state the strength of the assumptions we work on in order to understand better how the prediction can fail.

Of anything, it is helpful to ask the very simple question “how does this break?”. This is true for predictions as well, and a large part of the answer is that predictions break already in the assumptions, because the assumptions that we use are often biased, incomplete or wrong. This goes for all layers of assumptions, from general to specific. Let’s look at a few cases.

Assumptions of normalcy. A simple way for a prediction to break is that we assume that things will largely be normal in some sense. An example would be assuming that someone will arrive as planned, when extreme weather may derail their plans. Assumptions of normalcy are interesting in that they are based on some idea of a state of the world that is labelled as normal, but it is not clear for what set of world variables this is assumed. Is it for all variables in the world model? But how normal is it for everything to be normal? Or is it for a majority of the world variables?
Assumptions of uniformity. These are closely related to assumptions of normalcy, and they break when we wrongly assume that everything in a predicted set is uniform. Say we want to predict how well we will fare against a number of chess players, and that one of them is Magnus Carlsen. If we do not know this, we will work with average chess players in our models, and the outlier case of Carlsen will break the prediction.
Assumptions of configuration. We often assume that a problem is configured in a particular way, and that this configuration is given for this kind of problem. We may assume, for example, that a problem is a two-person game, whereas the real configuration is an n-person game where we do not know how many persons are playing.
Assumptions of time. There is a set of assumptions about pace, rhythm, time needed and so on. These assumptions break when there is a step change in the time in which something can be accomplished. The simplest example may be something like Blitzkrieg, which broke the earlier assumption in military strategy that it would take a number of days to establish a front, attack and make military maneuvers.

These are just a few of the assumptions that can break in different ways — there are many others, and an interesting exercise is to list your assumptions and try to understand how they can break, and in what ways they are brittle.

How many assumptions should we make when we predict something? Is it better to make many assumptions or will fewer assumptions make for a better prediction? Going back to our toy model above – are predictions more robust when x, the number of assumptions, increases, or what is the relationship between the number of assumptions and the robustness of a prediction? It is tempting here to assume that we should be looking at some kind of diminishing-returns curve that grows fast in the beginning and then levels out at some point, but even so it is interesting to think about where that point is.
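
One side of this question can be put in a toy sketch. It rests on a strong and purely hypothetical assumption of its own: that each of the x assumptions holds independently, with the same probability `p`. Under that assumption the chance that none of them breaks is `p ** x`, which falls as x grows. The function name `joint_credibility` and the value 0.9 are illustrative, and the sketch says nothing about how much accuracy each added assumption buys, which is the other half of the trade-off.

```python
def joint_credibility(p, x):
    """Probability that all x assumptions hold at once.

    Hypothetical simplification: each assumption holds independently
    with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if x < 0:
        raise ValueError("x must be non-negative")
    return p ** x


# How quickly does a prediction become fragile as assumptions pile up?
for x in (1, 3, 5, 10, 20):
    print(f"x = {x:2d}: chance that no assumption breaks = {joint_credibility(0.9, x):.3f}")
```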

This seems related to Jeff Bezos’s observations on how much of the information you should aim to have when you make a decision. Bezos writes:

Day 2 companies make high-quality decisions, but they make high-quality decisions slowly. To keep the energy and dynamism of Day 1, you have to somehow make high-quality, high-velocity decisions.
Easy for start-ups and very challenging for large organizations. Speed matters in business.
First, never use a one-size-fits-all decision-making process. Many decisions are reversible, two-way doors. Those decisions can use a light-weight process. For those, so what if you’re wrong?
Second, most decisions should probably be made with somewhere around 70 percent of the information you wish you had. If you wait for 90 percent, in most cases, you’re probably being slow. Plus, either way, you need to be good at quickly recognizing and correcting bad decisions. If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure.

So, does this mean that you should have 70 percent of the assumptions you would ideally want? That is at least one way of thinking about it. Another approach is to think about assumptions as variables in a Fermi problem.²

Superforecasters, as identified and studied by Philip Tetlock, often employ Fermi-style reasoning in their predictive work. They apply this method by:

Breaking down complex forecasting questions into smaller, more easily estimable components.
Making educated guesses for each component based on known facts and reasonable assumptions.
Combining these estimates mathematically to arrive at a final prediction.
Iteratively refining their estimates as new information becomes available.

This approach allows superforecasters to make more accurate predictions by leveraging their broad knowledge base, critical thinking skills, and ability to make reasonable approximations. It also helps them to identify key factors influencing outcomes and to adjust their forecasts systematically as circumstances change. The Fermi problem approach aligns well with superforecasters’ tendency to think probabilistically, consider multiple perspectives, and remain open to updating their views – all traits that contribute to their exceptional predictive accuracy.
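
The four steps can be sketched in a few lines. This is only an illustration under stated assumptions, not Tetlock's or any forecaster's actual procedure: the question (cups of coffee sold per day in a city), every number, and the function name `fermi_estimate` are made up, and each component is treated as a simple (low, high) range whose product gives the overall range.

```python
import math


def fermi_estimate(components):
    """Multiply per-component (low, high) guesses into (low, mid, high).

    The middle value uses the geometric mean of each range, a common
    choice when the guesses are really about orders of magnitude.
    """
    low = mid = high = 1.0
    for name, (lo, hi) in components.items():
        if not 0 < lo <= hi:
            raise ValueError(f"bad range for {name}: {(lo, hi)}")
        low *= lo
        mid *= math.sqrt(lo * hi)
        high *= hi
    return low, mid, high


# Steps 1 and 2: decompose the question and guess a range for each part.
# All numbers are hypothetical.
components = {
    "people_in_city": (500_000, 2_000_000),
    "share_buying_coffee": (0.1, 0.4),
    "cups_per_buyer_per_day": (1, 2),
}

# Step 3: combine the guesses mathematically.
first = fermi_estimate(components)

# Step 4: refine one component when new information arrives.
refined_components = dict(components, people_in_city=(900_000, 1_100_000))
refined = fermi_estimate(refined_components)

print("first guess (low, mid, high): ", first)
print("after update (low, mid, high):", refined)
```

Each entry in `components` is an assumption in the sense used above, so listing them this way also shows which one is doing the most work: narrowing the widest range tightens the final estimate the most.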

There is no simple way to determine the optimal number of sub-problems a problem should be broken up into when you Fermize it, but there seems to be an upper bound – breaking a problem up into 100 sub-problems is hardly viable. Perhaps we should even think about this in terms of the famed 7±2 rule of working memory?

Footnotes and references

1. See Manski, C.F., 2009. Identification for Prediction and Decision. Harvard University Press.
2. Fermi problems, named after physicist Enrico Fermi, are estimation exercises that involve breaking down complex questions into smaller, more manageable components. The method works by leveraging the power of decomposition and approximation to arrive at reasonably accurate estimates for seemingly intractable problems. The key principle is that while individual estimates may have significant errors, these errors tend to cancel out when multiple estimates are combined, leading to a surprisingly accurate final result. This approach is particularly effective because it allows problem-solvers to use readily available information and reasonable assumptions to tackle questions where precise data might be lacking.
