What is the relationship between predictions and explanations? Our first intuition may be to say that predictions are forward-looking and explanations directed towards the past, but that is not quite right. Are predictions not, in some sense, thinner than explanations? I can predict that something will happen without actually understanding why. I cannot explain something without understanding it, however – so prediction and explanation would seem to require different depths of understanding.
Predictive understanding is thin in the sense that what it understands is correlation, whereas explanatory understanding grasps causality, and perhaps even causality within a world model. How does it work in the inverse? Does the impossibility of predicting something mean that I do not understand it at all? If so, the limits of my predictive capability seem to be the limits of my understanding – or perhaps of my world.
In Professor Galit Shmueli’s work on the difference between explanatory and predictive modeling in statistics we find a set of useful distinctions.1 See e.g. Shmueli, G., 2010. To explain or to predict? Available at https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf Shmueli writes:
As a discipline, we must acknowledge the difference between explanatory, predictive and descriptive modeling, and integrate it into statistics education of statisticians and nonstatisticians, as early as possible but most importantly in “research methods” courses. This requires creating written materials that are easily accessible and understandable by nonstatisticians. We should advocate both explanatory and predictive modeling, clarify their differences and distinctive scientific and practical uses, and disseminate tools and knowledge for implementing both.
What then is the difference between the two? According to Shmueli, explanatory modeling and predictive modeling differ fundamentally in their goals, approaches, and outcomes. Explanatory modeling aims to test causal theories and hypotheses, focusing on minimizing bias and understanding underlying mechanisms. It typically uses theory-driven variable selection, interpretable statistical models, and evaluates performance based on goodness-of-fit and statistical significance. In contrast, predictive modeling seeks to accurately forecast new or future observations, balancing the trade-off between bias and variance to optimize predictive power. It often employs data-driven variable selection, may use complex or “black-box” algorithms, and evaluates performance using out-of-sample prediction accuracy. This distinction affects the power of the resulting models in that explanatory models may have high explanatory power (e.g., high R-squared) but poor predictive accuracy on new data, while predictive models may achieve high predictive accuracy without necessarily providing clear causal insights. Shmueli argues that these two types of power – explanatory and predictive – should be viewed as separate dimensions rather than extremes on a continuum, and that both are valuable for scientific progress in different ways.
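Shmueli’s two evaluation criteria – in-sample goodness-of-fit versus out-of-sample prediction accuracy – can be made concrete with a small sketch. The toy data, the variable names, and the 70/30 train/test split below are illustrative assumptions of mine, not anything taken from Shmueli’s paper; the point is only that the same fitted model is scored in two different ways:

```python
import random

random.seed(0)

# Toy data: y depends linearly on x, plus Gaussian noise.
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in xs]

# Split: first 70 observations to fit the model, last 30 held out.
x_tr, y_tr = xs[:70], ys[:70]
x_te, y_te = xs[70:], ys[70:]

def ols(x, y):
    """Ordinary least squares for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
            sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

slope, intercept = ols(x_tr, y_tr)

# Explanatory evaluation: goodness-of-fit (R^2) on the data
# the model was fitted to.
def r_squared(x, y):
    my = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

r2_in = r_squared(x_tr, y_tr)

# Predictive evaluation: mean squared error on observations
# the model has never seen.
mse_out = sum((b - (slope * a + intercept)) ** 2
              for a, b in zip(x_te, y_te)) / len(x_te)

print(f"in-sample R^2: {r2_in:.2f}")
print(f"out-of-sample MSE: {mse_out:.2f}")
```

Here the two scores happen to agree, because the model is correctly specified; Shmueli’s point is that they need not – a theory-faithful model can score well on the first criterion and poorly on the second, and a black-box model the reverse.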
Her perspective here mirrors that of David Krakauer in an interesting way. Krakauer noted on a recent podcast that we are looking at a coming schism in science:2 See https://complexity.simplecast.com/episodes/106/transcript
And I, we’re now entering in the 21st century a new kinda scientific schism where we’re gonna live with two very different ways of engaging with reality —a machine-based, high-dimensional, very precise, predictive framework that is a black box and ours, which is a more familiar framework from the history science, if you like, but that is faithful to the complexity of the systems we study, which doesn’t predict so well, but does allow us to understand the basic mechanisms generating the phenomena of interest. And that’s where I think complexity lives. And it’s gonna have to come to terms with living with machine learning and AI. It’s almost as if we’ve returned, to use your biblical metaphors, to the Cain and Abel and those two brothers are gonna have to get on as opposed to one killing the other.
This idea of one predictive and one explanatory view of the world that need to co-exist – two dimensions, as Shmueli has it – is fascinating, and could be read as the evolution of the split between the empiricist and rationalist traditions. Empiricists evolved into machine learning predictors, whereas rationalists evolved into complexity explainers. And this suggests that our early distinction is not quite right – prediction is not thinner than explanation, it merely approaches the world from another dimension (we could say from below and from above, but that suggests a difference that may not be helpful).
And we need both, as both Krakauer and Shmueli suggest, and we should teach both in schools as well. To understand something is to be able both to predict and explain it – and when we cannot do both our understanding is limited.
Is there something here about the historical balance between these two aspects of science? Can we say that historically our science has been explanation-heavy, but that lately it is becoming much more prediction-heavy? If we look at the composition of our knowledge of the world – our understanding of the world – is it primarily explanatory or predictive? Has that changed over time?
If so — should we say that we are moving from an explanatory paradigm to a predictive worldview?
It is intriguing to think about the extremes. Let’s imagine a world where explanations never evolved – their science is all predictions. When we ask why something happens they merely shrug and say that is a nonsense question — explanations in their view are just stories we tell about complex systems to make them more “human”. For them prediction, by complex computational models, is knowledge and understanding.
What, then, would they be missing?3 It is instructive to think about other institutions in a society like this – what would, for example, courts look like if they were limited to predictive modeling? Would they focus wholly on recidivism probabilities, or the probability that an unpunished action leads to more such actions, and base judicial decisions entirely on predictions? And what would art look like? Literature?
And what about the inverse world, in which we find a science that has no predictions and relies only on explanations – a science that considers prediction magic and nonsense. Everything, they say, can be explained in hindsight, but the world is so complex that any prediction can be wrong; making predictions is therefore useless and intellectually dishonest.
What would this culture be missing?
Or could we imagine a world in which the two co-exist but rarely interact? A school of explainers and one of predictors, each considering the other vaguely unintellectual (is this the split between the two cultures that Snow discusses? Or did he center on the distinction between understanding and explaining? Should that whole debate really be recast as one in which we should have been discussing prediction and understanding all along?)
In some ways it seems reasonable to argue that we are moving from the second world to the first: the Aristotelian, first-principles medieval scientific paradigm could be cast as deeply explanatory, and current machine-learning-inspired modeling as predictive – but is that right? Is there a further dimension we need to think about here?
Finally, returning to the point about limits: what can we say about the limits of both approaches? Where does explanation hit its limits, and what, exactly, limits prediction? And are these limits to knowledge different in some fundamental way? It seems as if they are – a limit of explanation seems harder to define than a limit to prediction, but why is that?
Footnotes and references
- 1See e.g. Shmueli, G., 2010. To explain or to predict? Available at https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf
- 2See https://complexity.simplecast.com/episodes/106/transcript
- 3It is instructive to think about other institutions in a society like this – what would, for example, courts look like if they were limited to predictive modeling? Would they focus wholly on recidivism probabilities, or the probability that an unpunished action leads to more such actions, and base judicial decisions entirely on predictions? And what would art look like? Literature?