There are now a number of reviews and round-ups of the most impactful AI papers of 2020. Data Science Central has a nice one – and also links to a few others. It is interesting to look at these to see what is highlighted and what is missing. I, for one, was surprised to see that the list from DSC does not seem to include any applications of AI to scientific work; it is mostly about language – from chatbots to GPT. Yet I believe it is quite possible that applications of artificial intelligence in science will be the most interesting and fundamental change we will see in the short term.
The idea that artificial intelligence can be applied in science is not new. Stanislaw Lem believed that it was one of the core reasons to develop what he called “intelectronics” and he noted that artificial scientists were one of the most promising ways in which we could deal with the inevitable shortage of scientific attention we increasingly face.
Lem’s argument can be translated into a simple model. Imagine that the sum of our knowledge of the world is represented by a sphere, and that we are in the midst of that sphere. As we learn more about the world the sphere grows – leading to an interesting search problem. Let us imagine that learning is essentially a search across the surface of the sphere, that the scientific insights that allow us to expand the sphere are randomly distributed across that surface, and that the number of such insights is constant. The search problem then grows with the area of the sphere.
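The sphere model can be written down roughly as follows. This is only a sketch of the argument: the symbols r (radius of the knowledge sphere), N (the constant number of insights on its surface), a (the area a searcher can cover per unit of time) and T (the expected time to find an insight) are my own assumptions, not Lem's notation.

```latex
% Surface area of the knowledge sphere, and the density of insights on it
A(r) = 4 \pi r^{2}, \qquad \rho(r) = \frac{N}{A(r)} = \frac{N}{4 \pi r^{2}}

% Rough expected time to find one insight when covering area a per unit time
T(r) \approx \frac{A(r)}{N a} = \frac{4 \pi r^{2}}{N a}

% Doubling the radius roughly quadruples the search time
\frac{T(2r)}{T(r)} \approx 4
```

In other words, if the number of insights stays fixed while the sphere grows, each insight gets harder to find in proportion to the square of the radius.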
The set of eligible scientific problems we can work on grows at one pace P, and the available scientific attention at another S. As long as P grows faster than S, the result will be a slowing down of the growth of science.
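The mismatch between P and S can be illustrated with a small Python sketch. It assumes, purely for illustration, that both quantities compound at fixed rates per step; the function name `coverage_over_time` and the rates used are hypothetical and not taken from Lem or anyone else.

```python
def coverage_over_time(p_growth, s_growth, steps, p0=1.0, s0=1.0):
    """Return the ratio S/P (attention per eligible problem) at each step.

    Assumption: P (eligible problems) and S (scientific attention)
    each grow by a fixed fraction per step. The text only says they
    grow at different paces, so compounding is an illustrative choice.
    """
    problems, attention = p0, s0
    ratios = []
    for _ in range(steps):
        ratios.append(attention / problems)
        problems *= 1.0 + p_growth
        attention *= 1.0 + s_growth
    return ratios


if __name__ == "__main__":
    # Hypothetical rates: problems grow 10% per step, attention 3%.
    for step, ratio in enumerate(coverage_over_time(0.10, 0.03, 5)):
        print(f"step {step}: attention per problem = {ratio:.3f}")
```

Whenever the first rate is larger than the second, the ratio falls at every step, which is the slowing down described above; with equal rates it stays flat.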
Lem realized this well, and the same insight has been brought up again and again, for example by Nassim Nicholas Taleb when he discusses the tragedy of big data – where exponentially growing data sets leave us struggling to sift spurious correlations from meaningful causation, with the same kind of mismatch in rates.
In its simplest form it is the problem Herbert Simon was addressing when he noted that with a wealth of information comes a scarcity of attention, and a need to allocate attention efficiently. Simon, like Lem, realized that we need to build tools – both Simon and Lem believed that artificial intelligence was the key tool here – to help us deal with the tension between information and attention.
The way Simon formulated the problem downplays the dynamic, however. The rates at which the wealth of information grows and attention is spent are key – and there are boundary scenarios in which the discrepancy between the two becomes disastrous. Any search problem across a space that grows too fast, and in which the solutions become too dispersed, is unsolvable. Imagine searching in not just a dancing fitness landscape, but in a dancing and rapidly expanding fitness landscape – that is the kind of problem we are looking at now: one where the move from a local maximum to any other point involves traveling across expanding deserts of suboptimality.
Our best bet is thinking through how artificial intelligence can be turned into a scientific tool, re-imagining the scientific method with new technologies. This will require some interesting epistemological work as well — Lem notes that we are close to a point where understanding will be decoupled from predicting. We will be able to predict systems without understanding how we do it. This in turn will require that we re-examine the notion of explaining. When is a phenomenon explained — is it when we can understand and predict it or is prediction sufficient?
If we do need to make that distinction, the notions of p-explanations (explanations that rest on prediction alone) and u-explanations (explanations that also rest on understanding) will challenge how we think about science overall.
This is a long way of saying that I would have expected more articles around artificial intelligence in science in these lists (this comes to mind), and this makes me think that the impact AI will have not on writing texts or chatting, but on scientific exploration, is undervalued. There is a connection here with the interesting thinking pattern of observation that I wrote about in my first newsletter as well — can we make computers not think so much as observe?