• In this paper, the researchers look at differences in job postings between the private and the public sector in the US.

    Talent flows and talent patterns are often undervalued as a metric for understanding and tracking the transformation of different sectors, and it is interesting to note that private sector postings for AI jobs have gone up significantly, whereas public sector postings have remained flat — a sign that the transformation is not yet visible in public sector talent flows.

    There is also, of course, the finding that salaries in the private sector are on average 50% higher than in the public sector — but even so, if demand in the public sector has not changed, the salary gap seems to matter somewhat less for the talent asymmetries than the difference in demand itself.

    It may be an interesting idea to explore setting talent targets – recruiting targets – for the public sector, in order to ensure that it has transformative capability. To do this you would also need an idea of the desired talent profile. According to the study, the public sector recruits more for scientists, with less flexible criteria for experience and education than the private sector — and if you are not recruiting developers, your transformative capability will remain low.

    All of this is not simple, of course, and you could argue that transformative capability can be procured, but even public procurement relies on some level of in-house talent to create well-defined tenders and contracts, and to define what should be done.

    Politicians and decision makers thinking about how to harness AI might want to track talent in a more granular way.

    See more here: Makridis, Christos and Alterovitz, Gil, Measuring and Understanding Differences in Private and Public Sector Technology Jobs: Evidence from Artificial Intelligence Job Posting Data (July 10, 2024). Available at SSRN: https://ssrn.com/abstract=4891300 or http://dx.doi.org/10.2139/ssrn.4891300

  • In 2018 Sandra Wachter and Brent Mittelstadt published a really interesting article about the role of inferences in data protection law.1 Their conclusion was that there is a need for a right to reasonable inferences, and that under current data protection law the legal standing of inferences remains uncertain.

    The article is well-argued and thorough, but it hinges on a use of “inference” that is tricky – especially in view of recent discussions about how data protection law applies to large language models.

    We see this clearly if we replace the word “inference” with the word “prediction”. The idea that there is a right to reasonable predictions about us is subtly, and importantly, different from the idea that there is a right to reasonable inferences – and, I would argue, probably harder to resolve.

    One reason for this is that there exist shades of prediction — and we need to treat different kinds of predictions, or statements about the future, differently. There are different possible taxonomies we could play around with.

    Here’s a potential scale of terms, ranging from low to high certainty:

    • Guess: A casual, intuitive estimate with low certainty and little to no evidentiary basis.
    • Conjecture: An educated guess based on incomplete information or unproven assumptions.
    • Forecast: A more formal, model-based estimate of future outcomes, often with explicit probabilistic bounds and stated assumptions.
    • Prediction: A relatively high-certainty statement about the future, often based on robust evidence, well-validated models, or expert judgment.
    • Prophecy: A categorical claim about the future with no uncertainty, often based on alleged divine or supernatural insight rather than evidence.
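
    To make the taxonomy concrete, here is a minimal sketch in Python; the certainty ordering and the reasonable_to_rely_on rule are my own illustrative assumptions, not anything proposed in the paper:

    ```python
    from dataclasses import dataclass, field
    from enum import IntEnum


    class StatementKind(IntEnum):
        """Statements about the future, ordered from low to high certainty."""
        GUESS = 1       # casual, intuitive, little or no evidentiary basis
        CONJECTURE = 2  # educated guess from incomplete information
        FORECAST = 3    # model-based, with explicit probabilistic bounds
        PREDICTION = 4  # robust evidence, well-validated models or expert judgment
        PROPHECY = 5    # categorical claim, no stated uncertainty at all


    @dataclass
    class FutureStatement:
        text: str
        kind: StatementKind
        evidence: list = field(default_factory=list)  # sources or models relied on
        stated_uncertainty: bool = False              # explicit error bounds given?


    def reasonable_to_rely_on(s: FutureStatement) -> bool:
        """Illustrative rule: only forecasts and predictions that come with
        stated uncertainty and some evidentiary basis would ground reliance,
        and, by extension, any right to 'reasonable' versions of them."""
        return (
            s.kind in (StatementKind.FORECAST, StatementKind.PREDICTION)
            and s.stated_uncertainty
            and len(s.evidence) > 0
        )
    ```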

    You could argue that each of these should be treated differently, not least because of the differences in their effects and impacts, and in how reasonable it is to rely on them. It also seems clear that they have different economic value, and so should be thought about differently from a law and economics perspective as well. There could, for example, be a right to reasonable predictions, but no right to reasonable conjectures.

    This, of course, raises the issue of how we determine what it is that a large language model is doing — and here I would argue that it stops short of forecasts in our hierarchy. Most language models operate in the realm of conjecture – which is both a strength and a possible weakness. Now, you could argue that this understates the quality of language models – but I don’t think so, not at this point in time. That may change in the future, though.

    Overall, predictions will become much more important in understanding society, rights, law and other things in the future – and there is a lot of work to be done on the legal aspects of different kinds of statements about the future (as these become more accurate and we rely on them to make decisions).

    The perhaps harder question here is the one that Wachter and Mittelstadt explore at the beginning of the paper — whether data protection law should guarantee good decision making. It is tempting to suggest that we need data protection, or data quality, legislation to do so — but that is also a dangerous path to take, since it suggests that decision-making responsibility can be spread across a “decision value chain” — something that deflects responsibility away from the decision makers. If we instead argue that there is a general responsibility in certain institutions to make good decisions, we incentivize these institutions to constantly examine the grounds on which they decide — and that seems a better path.

    Footnotes and references

    • 1
      Wachter, Sandra and Mittelstadt, Brent, A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI (October 5, 2018). Columbia Business Law Review, 2019(2), Available at SSRN: https://ssrn.com/abstract=3248829
  • In this paper, the authors ask if we can obtain real mathematical knowledge from opaque models. It is the continuation of a discussion about truth in mathematics, and about whether an exhaustive search can be considered a proof — and it connects to a broader set of problems about what happens when we have knowledge decoupled from understanding, and whether that is possible at all. The authors seem to avoid the question, however, by suggesting that proof checkers be attached to the black boxes, allowing us to check whether the proofs they produce are right or wrong. The real question, it seems, is what happens when that is not possible, but the applications of the mathematical propositions we have received from the black box are important and promising in different ways. It goes back to the question of how knowledge changes in a world with really complex tools — certainly a question we will have to engage with more in the coming years. The way we use concepts like “explanation”, “understanding”, “knowledge” and “certainty” is likely to change radically, and force a philosophical re-examination of knowledge in an interesting way. This is sort of a corollary of the exploration of ignorance I am interested in (more here).

  • Here are a few recent pieces (Jan-March) I have written on different topics.

    I am also looking forward to the forthcoming book Rilke och Filosoferna (Swedish), where I contributed a chapter on Wittgenstein and Rilke. I hope 2024 is a writing year — I find that I need to focus on keeping myself honest, so writing here might be a good way to do that. If you know of any good places I can write, do let me know.

  • In many ways ChatGPT was less of a technical breakthrough than a user interface breakthrough. By organising and configuring the capabilities of the model in a novel way – chat – the implications of the underlying technology became more accessible and open to analysis and understanding. Arguably, configuration of existing capabilities is a key question that is understudied overall.1 This seems to suggest that we should be interested in what the next UI-moments will be, and how they will affect the overall public and political understanding of artificial intelligence. In other words – what other configurations and organisations of existing capabilities can we already predict will have a large impact if they are presented in such a way as to make them accessible to a large portion of the public?

    One such moment – obvious almost – is the agent moment. Instead of single-prompted models we get mission- or mandate-prompted agents that can perform complex sequences of tasks or optimise for open-world environments. Again, we do not need to think about this as a technical or research breakthrough, as much as a configuration of existing capabilities in an interface. What this unlocks, though, is potentially very interesting – since what we will be looking at then is the rise of artificial agency.2
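
    To make the configuration point concrete, here is a minimal sketch of what a mandate-prompted agent loop could look like; the call_model function and the tool registry are hypothetical stand-ins rather than any particular product’s API, and nothing beyond an existing language model is assumed:

    ```python
    def call_model(prompt: str) -> str:
        """Hypothetical stand-in for a call to an existing language model."""
        raise NotImplementedError


    def run_agent(mandate: str, tools: dict, max_steps: int = 10) -> str:
        """A mandate-prompted agent: the model repeatedly chooses the next
        action until it declares the mandate fulfilled. No new model
        capability is assumed, only a loop, a history and a tool registry."""
        history = [f"Mandate: {mandate}"]
        for _ in range(max_steps):
            step = call_model("\n".join(history) + "\nNext action as 'tool: argument', or DONE:")
            if step.strip().upper().startswith("DONE"):
                break
            tool_name, _, argument = step.partition(":")
            tool = tools.get(tool_name.strip())
            result = tool(argument.strip()) if tool else f"Unknown tool {tool_name!r}"
            history.append(f"Action: {step}\nResult: {result}")
        return "\n".join(history)
    ```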

    Agency is the ability to direct attention and action in different ways – to decide what you want to do and focus on it. We could make an argument for agency being one of the most interesting components of any social system, especially when it comes to very complex systems where different agencies interact, compete and collaborate. This is the notion of studying teleonomic matter that David Krakauer and others have suggested lies at the heart of complexity science.3

    What this moment will help us explore is how much in society – from processes to institutions – has been premised on a general scarcity of agency. What happens to – say – markets when you add orders of magnitude of agency? Does the structure of agency and the distribution of it matter? How should we think about the overall ability to augment human agency in social systems? These questions may turn out to be quite consequential, and deeply interesting.

    Note that it is quite possible that artificial agency will be very different from human agency – perhaps there are basic biological conditions, such as aging and death, that create layers of agency and so lead to a complex kind of human agency that may not be replicated in artificial systems, unless we choose to do so.

    This moment is coming closer fast — this article outlines some interesting approaches and results in designing in-game agents, and there is no reason to immediately believe that it is impossible to translate and configure these capabilities so that they can be deployed outside of games as well.

    Notes

    Footnotes and references

    • 1
      See the excellent Fennell, Lee Anne. Slices and Lumps: Division and Aggregation in Law and Life. University of Chicago Press, 2019.
    • 2
      As also discussed here in this earlier post: https://unpredictablepatterns.com/2023/07/19/some-policy-problems-in-artificial-agency-agency-and-policy-i/
    • 3
      Interesting conversation with Sean Carroll here: https://www.preposterousuniverse.com/podcast/2023/07/10/242-david-krakauer-on-complexity-agency-and-information/

  • These are some notes for a panel discussion this afternoon – they are sketches; much more work is needed, and I still need to figure out whether this is quite right. I don’t feel it is, not yet.

    Notes on macroeconomics and AI

    The mental models we have when we try to assess the effects of artificial intelligence on the economy matter. There are several different mental models in circulation, and it is useful to try to sketch them out.

    First, let’s look at AI-as-additional-humans. In this model AI is, in some respect, simply the addition of more humans to the economy. If these new “humans” offer labor at a lower cost, this will affect the economy much as an influx of more labor would overall.

    Second, let’s look at AI-as-cheaper-prediction. This perspective (from Agrawal et al 2020) suggests that AI is a way to reduce uncertainty in the economy. Ultimately, if this is true, the endpoint may well be a planned economy in which the elimination of prediction cost means that we can coordinate much more effectively, but we could also imagine that there are boundary horizons where prediction reaches a lowest possible cost, and can go no further.

    Third, we can think about AI as driving automation overall — the results would be things like a worsening of Baumol’s cost disease and possibly increasing unemployment in sectors that can be automated.

    Fourth, we can think about the role of intelligence in the economy, and argue that we should think about AI-as-intelligence, and try to model how an influx of intelligence into an economy changes it in different ways. A version of this is to say that AI, when advanced enough, will be able to pay attention, and in so doing will affect all institutions and mechanisms in the economy that in some way depend on attention. This perspective on AI is not new, and in fact was one of the reasons economists like Herbert Simon thought we needed AI in the first place (Simon 1969).

    Fifth, we can try to look at learning as an economic process, and argue that the changes in the economy that are likely to come are due to increased speed of social learning, and perhaps of scientific discovery – looking at different kinds of scientific productivity measures as keys to understanding how society evolves under AI.

    Sixth, we can model AI as a vast increase in the number of agents in the system – and think about whether economic systems go through phase shifts as the number of agents increases by orders of magnitude. Does it matter for an economy if it has a billion or a trillion trillion actors?

    These are but some of the possible aspects we can explore.

    It is also worthwhile teasing out what our assumptions about AI in the economy seem to be – and if there are any root assumptions that are worth exploring and questioning more in detail.

    One such root assumption may be about control and complexity. Do we think that the addition of AI to an economy adds more complexity, or do we think it reduces complexity? Or does it re-allocate complexity in different ways?

    Why does this matter?

    In order to answer this question we need to think about the relationship between complexity and the economy. Complexity can create more robust systems, but it can also reduce the overall predictability of these systems — and so essentially mean that what we end up with is a system that does not collapse, but behaves in ways that are inherently opaque to us.

    We might want to use a simple toy example and ask if markets are more complex than weather, or if they are currently in the same complexity class. We should also think about what it would mean if they end up in a different complexity class and how that would affect society.

    We live in a world that is premised on the idea that political choice can impact economic development. It is questionable if that is true even now, but we seem to think that the economy is possible to influence to some degree. The weather, not so much. So if we look at this we can imagine two different complexity classes: the first one with systems that we can still influence and impact in different ways, the second with systems that are essentially inaccessible to our will.

    Now, we also need to be careful with concepts here — the market is not the same thing as the economy, and weather is not the same thing as climate. We may be able to argue, even, that the market is to the economy what the weather is to climate – an interesting analogy in complexity classes – and then examine what would happen if this were to change in different ways; if the economy shifts into the same complexity class as markets and weather, for example.

    Another assumption that is worth looking at is that any sufficiently advanced AGI would indeed be an economic actor at all — and why that would be. We assume that participation in the economy is a natural choice, but why would it be? And what mode of participation should we expect? This is another kind of problem, and it is based on the anthropomorphism that plagues the field of AI and X, where X is any arbitrary subject.

    We also do not know exactly what this participatory pattern could look like. The way we engage in the economy is not continuous, but discrete. We buy and sell things, but not all the time. We invest, sometimes, and sell sometimes – but in patterns. These patterns are human; what could completely different patterns look like? What classes of economic participation patterns exist and where should we predict that AI ends up?

    AI’s impact on the economy could also be discussed as a consequence of how AI changes the time it takes for us to do different things. We know, with innovations like AlphaFold, that AI can compress the time it takes to perform scientific work – and this has impacts on that work, but also on the nature of science as a whole.

    If we only assume that compressed time means that we get more labor, we seem to assume a completely synchronised economy – but those effects are likely to be gated by parts of the economy that have another kind of rhythm. So how do we account for the changes in rhythm in the economy that AI could lead to?

    Much more here — worth coming back to after the seminar.

  • Can space save the economy from secular stagnation? That is the hypothesis explored in this paper by Matthew Weinzierl. The way you react to papers like this is interesting – there are, I find, two typical reactions. The first is an enthusiastic “Yes!” and the other is a deep sigh, followed by complaints about people looking for solutions in space when we should fix things here on Earth. There should be a third option, however, and that is curiosity.

    Being curious about something means keeping an open mind, exploring the ideas we encounter and asking questions about them – an increasingly rare mode of public engagement. Part of the tribal turn in politics is that we ask fewer questions, and generally remain deeply incurious about the world we live in. We do not entertain the notion that we might suspend our judgment and just explore – to learn more.

    Tyler Cowen once warned against the tendency to “devalue and dismiss”, and this mode of engagement is now becoming increasingly common.1 This makes us more stupid, in a special kind of way: this is a stupidity that compounds negatively, since our ability to be curious about other things also diminishes (once you have dismissed something, the connected subjects are also easy to devalue and dismiss).

    So, we should be curious about space. We should explore the notion that perhaps space is a solution to secular stagnation. And we should think about mining in space, macroeconomic spill-overs and the economic aspects of space overall.

    Also – we should recognise that curiosity is a process, not a desire to resolve uncertainty into certainty immediately. This is why the phenomenon of spoilers is worth exploring more in detail – as this paper shows.

    Stay curious.

    Notes

    Footnotes and references

    • 1
      See https://marginalrevolution.com/marginalrevolution/2014/01/the-devalue-and-dismiss-fallacy-methodological-pluralism-and-dsge-models.html#:~:text=One%20of%20the%20most%20common,then%20dismiss%20that%20argument%20altogether.

  • One of the key features of interaction is the sense of presence. We immediately feel it if someone is not present in a discussion, and we often praise someone by saying that they have a great presence in the room – signalling that they are influencing the situation in a positive way. In fact, it is really hard to imagine any interaction without also imagining the presence within which that interaction plays out.

    In Presence: The Strange Science and True Stories of the Unseen Other (Manchester University Press 2023), psychologist Ben Alderson-Day explores this phenomenon in depth. From the presence of voices in people who suffer from some version of schizophrenia to the recurring phenomenon of presences on distant expeditions into harsh landscapes, the author explores how presence is perceived, and to some degree also constructed. One way to think about this is to say that presence is a bit like opening a window on your virtual desktop: it creates the frame and affordances for whatever you want to do next. The ability to construct and sense presence is absolutely essential if we want to communicate with each other, and it is ultimately a relational phenomenon.

    Indeed, the sense of a presence in an empty space, on a lonely journey or in an empty house may well be an artefact of the mind’s default mode of existing in relationship to others. We do not have unique minds inside our heads – our minds are relationships between different people, and so we need that other presence in order to think, and in order to be able to really perceive the world. So the mind has the in-built ability to create a virtual presence where no real presence exists.

    One of the most extreme examples of this is the artificially generated presence of the Tibetan Tulpa. A Tulpa is a presence that has been carefully built, infused with its own life and intentions and then set free from our own minds, effectively acting as another individual, but wholly designed by ourselves. We are all, to some degree, tulpamancers – we all know how to conjure a Tulpa – since we all have the experience of imaginary friends. These imaginary friends allow us to practice having a mind with another in a safe environment, and so work as a kind of beta testing of the young mind.

    All of this takes an interesting turn with the emergence of large language models, since we now have the ability to create something that is able to have a presence – and to interact with these new models as if they were intentional. An artificial intelligence is only possible if it also manages to create an artificial presence, and one of the astonishing things about large language models is that they have managed to do so almost without us noticing. The world is now full of other presences, slowly entering into different kinds of interactions with us. We are, in some sense, all tulpamancers again, building not imaginary friends, but perhaps virtual companions.

    There are many reasons to be attentive to this development, not least because we want to make sure that people do not confuse a language model with a real human being. The risks associated with such confusion are easy to see – since what it would essentially mean is that we co-create our mind with an entity that is vastly different from us. A language model has not evolved, it is not embodied and it has no connection to the larger ecosystem we exist in. Its presence is separate, almost alien, but we still recognise it as a presence.

    We can compare with dogs. A dog projects presence in a home, and it seems clear that we have human/dog minds, at least if we are dog owners. If you grew up with a dog you can activate that particular mode of mind when you meet a dog, and it is often noticeable when people “are good with animals” or have a special rapport with different kinds of pets. This ability to mind-share in a joint presence is something humankind has honed over many, many generations of co-evolution. You could even argue that this ability is now a human character trait, much like eye color or skin tone. There are those that completely lack this ability, and those that have an uncanny connection with animals and manage to co-create minds with all kinds of them.

    The key takeaway from this is that the ability to co-create a mind with another is an evolved capability, and something that takes a long time to work out. There are, in addition, clear mental strengths that need to be developed. Interacting with a dog requires training and understanding the pre-conditions and parameters of the mind you are co-creating. 

    We can generalise this and note that our minds are really a number of different minds created in different presences, all connecting to a single set of minds that we compress into the notion of an I. This is what we mean when we say things like “I am a different person with X” or “You complete me”, or when we cast ourselves in different roles and wear different masks in different contexts. What is really going on is not just that we are masking an inner secret self: we really are different with different people, and the minds we co-create with them are us, but also not us. The I is secretly a set of complex we’s, and the pre-condition for creating that we is presence.

    What does this mean, then, for artificial intelligence and how we should think about language models? As these models get better, we are likely to be even more enticed to co-create minds with them and interact with them in ways that are a lot like the ways in which we interact with each other. But we need to remember that these artefacts are really more like our imaginary friends than our real relationships – and we probably need to develop what researcher Erik Hoel calls a set of intrinsic innovations – mental skills – that help us interact with these models.

    A lot of how we think about these models now is about how we can fix the models so that they say nothing harmful and do nothing that is dangerous. We are treating these technologies as if they were merely mechanical, but they are more than that – they are intentional technologies, technologies that can create presence and a sense of intent. This means that we may need to complement our efforts to create safety mechanisms in the machine with efforts to create safety mechanisms in our minds.

    There is, then, an art to co-creating a mind with a language model – and it is not something we are naturally good at, since these models have not been around for long. This art reminds us of a sort of tulpamancy – the knowing construction of an artificial presence that we can interact with in different ways. A conscious and intentional crafting of an imaginary friend. One part of safety research, then, also needs to be research into the mental techniques that we need to develop to interact with artificial presences and intentional systems. And it is not just about intellectual training – it is about feeling these presences and intentional systems, understanding how they co-opt age-old evolutionary mechanisms for creating relational minds, and figuring out ways in which we can respond mentally to ensure that we can use these new tools. It requires a kind of mentalics to interact well with, and co-create functional and safe minds with, artificial intelligence.

    A surprising conclusion? Perhaps. But the more artificial presences and intentional artefacts we build, the more attention we need to pay to our own minds and how they work. We need to explore how we think and how we think with things, people, presences and other tools. Artificial intelligence is not a substitute for our intelligence, but a complement – and for it to really be that complement we need to develop the skills to interact with such technologies.

    It is not unlike learning to ride a bike or drive a car. A lot of the training there is the building of mental constructs and mechanisms that we can draw on, and this is something we need here too. How we do that is not clear – and I do think that we need research here – but some simple starting points could be meditation, a recognition of the alien nature of the presences created by these models, and conscious exploration of how the co-created minds work, where they behave weirdly and where they are helpful. It requires a skillful introspective ability to do so, and such an ability is probably useful for us overall in an ever more complex world.

    We are all tulpamancers now. 

  • One of the things that generative AI will enable is the summarisation of the growing flows of information that we all live in. This is not surprising to the reader of Herbert Simon, who suggested that with a wealth of information comes a poverty of attention and a need to allocate attention efficiently. Now, what it does help us understand is that attention allocation can be achieved in a multitude of ways. The first is to help us focus attention on the right piece of information – this is essentially what a recommendation algorithm does. The second is to focus attention on the right features, and at the right resolution, in an information set. This is what a summarising algorithm does.

    Summaries have been around for ages, most of them produced by people – and so they are not new in themselves. The time it takes to summarise a field has grown, however, and today there are several fields of research and knowledge that are impossible to summarise well before they change in fundamental ways. The limits of summarisation also limit how we can update our knowledge.

    Now, if we believe that the quality of our decisions depends on the quality of the information we draw on when we make those decisions, this should worry us. It seems we could make better decisions if we had access to summaries of different fields as they evolve – at least if it is true that these summaries can be made in such a way that they capture the salient features of the evolving information landscape that are relevant to the decisions we want to make. Are such summaries possible? Is generative AI’s tendency to hallucinate a fatal flaw in producing such summaries? This is increasingly the focus of research.

    The paper “Improving Primary Healthcare Workflow Using Extreme Summarization of Scientific Literature Based on Generative AI” proposes a summarisation approach to address the challenge faced by primary care professionals in keeping up-to-date with the latest scientific literature.

    The researchers employed generative artificial intelligence techniques based on large-scale language models to summarise abstracts of scientific papers. The goal was to reduce the cognitive load experienced by practitioners, making it easier for them to stay informed about the latest developments relevant to their work. The study involved 113 university students from Slovenia and the United States, who were divided into three groups. Each group was provided with different types of abstracts (full abstracts, AI-generated short abstracts, or the option to choose between the two) and use cases related to preventive care and behaviour change.

    The findings in the paper suggest that AI-generated summaries can significantly reduce the time needed to review scientific literature. However, the accuracy of knowledge extraction was lower in cases where the full abstract was not available, and this is key — we need to have a sense of what the ideal resolution of the data set used for summarisation really is. It seems obvious that summaries made from a set of full papers will be more costly and take longer to produce than summaries made from some kind of abstracts. This in turn suggests that we should think about a hierarchy of summaries here, and that there may be an argument for requiring longer abstracts on all submitted papers, so that the abstracts are more legible for the AI models that summarise them!
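
    A minimal sketch of what such a hierarchy of summaries could look like; the summarise function is a placeholder for whatever summarisation model one would actually use, and the word budgets are arbitrary illustrative choices:

    ```python
    from typing import Callable

    # Hypothetical summarisation backend: (text, max_words) -> summary.
    # In practice this would wrap a large language model or an extractive summariser.
    Summariser = Callable[[str, int], str]


    def summary_hierarchy(full_text: str, abstract: str, summarise: Summariser) -> dict:
        """Build summaries at decreasing resolution: from the full paper, from
        its abstract, and an 'extreme' one- or two-sentence version. Lower
        resolutions are cheaper, but risk losing the details that readers
        need for accurate knowledge extraction."""
        return {
            "from_full_text": summarise(full_text, 200),
            "from_abstract": summarise(abstract, 60),
            "extreme": summarise(abstract, 25),
        }
    ```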

    The design of summaries will quickly become a key element in how we learn about things.

    In another example the paper titled “Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system” explores the application of Large Language Models (LLMs) to improve efficiency and effectiveness of online meetings. The researchers designed, implemented, and evaluated a system that uses dialogue summarisation to create meeting recaps, focusing on two types of recap representations: important highlights, and a structured, hierarchical minutes view.

    The study involves seven users in the context of their work meetings and finds that the LLM-based dialogue summarisation shows promise. However, it also uncovers limitations, such as the inability of the system to understand what’s personally relevant to participants, a tendency to miss important details, and the potential for mis-attributions that could affect group dynamics – but does that matter?

    What is interesting is that there may be a quite low threshold for the quality of summaries for many readers (this will vary of course) and so even summaries that have these limitations could easily be valuable if you miss a meeting and no notes were taken. To preserve privacy, we would also have to think through different kinds of limits on attribution here.

    One way to think about summarisation algorithms is to suggest that they compress information – and that raises an interesting question about how much information can be compressed for different purposes. How much can a meeting be compressed? This question quickly turns very funny, since we have all been in meetings that could have been compressed not even to emails, but into the sentence “we don’t know what we are doing, really” – but there is a serious use for it as well: we should look at the information density of our different activities.

    One area that is over-ripe for information compression is email. If you look at your inbox in the morning and imagine that you could summarise it into a few items – how much would actually be lost? I often find several emails on the same subject, and would love the ability to just ask my inbox to summarise the latest on a subject, with views and action items. That would give me a “project view” of my email, and I would be able to step back and track work streams. The fact that email is not already the subject of massive efforts of summarisation and compression is somewhat baffling.

    You could also allow for different summarisation views – summarise by individual, by project or by topic. Summarise by emotional content — give me all the angry emails first. There are endless opportunities. One example of this idea – in this case summarising by topic – is found in the paper titled “An End-to-End Workflow using Topic Segmentation and Text Summarisation Methods for Improved Podcast Comprehension”, where the authors combine topic segmentation and text summarisation techniques to provide a topic-by-topic breakdown, rather than a general summary of the entire podcast.
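
    Here is a sketch of what such views could look like for an inbox (the Email type, the project tags and the summarise call are illustrative assumptions, not an existing mail API):

    ```python
    from collections import defaultdict
    from dataclasses import dataclass


    @dataclass
    class Email:
        sender: str
        subject: str
        body: str
        project: str  # assumes some upstream step has tagged each mail with a project


    def summarise(texts: list) -> str:
        """Hypothetical stand-in for a summarisation model."""
        raise NotImplementedError


    def project_view(inbox: list) -> dict:
        """Group mail by project and compress each group into one summary,
        the 'project view' described above. Other views (by sender, by topic,
        by emotional tone) would just swap the grouping key."""
        grouped = defaultdict(list)
        for mail in inbox:
            grouped[mail.project].append(f"{mail.sender}: {mail.subject}\n{mail.body}")
        return {project: summarise(bodies) for project, bodies in grouped.items()}
    ```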

    Now, you may be tempted to test this by pasting your inbox headlines into a chatbot – but you shouldn’t; remember that your email is probably filled with somewhat sensitive information. But the fact that you may be tempted shows something: even just headline summaries would sometimes be helpful – or would highlight how bad we are at writing good subject lines in email.

    Summarisation naturally also carries risks – summaries destroy information and nuance – and you could imagine second- and third-order effects in a society that consumes summaries rather than the original thing: the lack of nuance could be disastrous for some classes of decision (legal decision making comes to mind). This is true for all shifts and changes in attention allocation, however – since society is made of attention.

    Recommendation algorithms and summarisation algorithms are just two dimensions here. How we redesign – necessarily redesign, we should say – attention allocation will change society too.

  • In this article Robin Hill suggests something that may seem both obvious and strange at the same time: that our artificial intelligence systems might not cut the world up in the same way we do, or that they may not use the same features to cluster concepts as we do. The example she gives is a dog – we recognize it by looking at the fur, the ears, the wet nose, etc. – those are the features we focus on – but why should the machine focus on them? Why should we assume that it divides the world into wholes and parts the way we do?

    This is a deep question, one associated with the often neglected subject of mereology1 – a very useful way of thinking about concepts. In mereology we recognise the wholeness of the “dog” and the parts that go into constructing that whole, but we also realise quickly that there are many, many different ways in which we could construct that particular whole. This includes lumping together or slicing things differently across a number of different dimensions, including a temporal dimension. The dog may be made of moments in space.

    We could hypothesize that an all-knowing super-intelligence might actually converge on finding patterns in particle paths through space-time, and so would recognize entirely different concepts from the ones we do. A clustering of such paths may conjure the concept of “Wegobans”, who represent certain particle paths through space-time with strong commonalities that we cannot even begin to guess.

    Now, the way we slice and lump (to use the evocative language that Lee Anne Fennell uses in her excellent book Slices and Lumps: Division and Aggregation in Law and Life (2019)) the world is not arbitrary; it is rooted in evolution. Our concepts have evolved for a function – as Ruth Millikan has shown – and so we should expect human mereology to follow from that evolutionary path. But what if that particular mereology is not preserved in the training of large language models? What if that is a feature that is lost in the methods we currently use?

    What would that mean?

    We get into deep philosophy of language territory here – there seems to be a chance that we end up in false communication patterns, where the symmetry in the way we use the signs convinces us that we mean the same thing, but the structural composition of those signs into wholes and parts is radically different.

    This is something like what Nelson Goodman suggested with his grue/bleen experiment2, where a concept can be composed in arbitrarily many different ways – or at least along the temporal dimension of the concept3. That is in turn interesting because there will be, in any such faux communication, points where the difference in the composition of the signs leads to drastic break-downs in the ability to convey meaning to each other.4
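
    A toy illustration of how a temporally composed concept can agree with ours on every observation made so far and still carve the world differently; the cutoff date is arbitrary, in the spirit of Goodman’s construction:

    ```python
    from datetime import date

    CUTOFF = date(2030, 1, 1)  # arbitrary cutoff, as in Goodman's construction


    def is_green(colour: str) -> bool:
        return colour == "green"


    def is_grue(colour: str, observed_on: date) -> bool:
        """'Grue': green if observed before the cutoff, blue afterwards.
        For every observation made so far the two predicates coincide,
        yet they compose the concept out of different temporal parts."""
        return colour == "green" if observed_on < CUTOFF else colour == "blue"


    # Before the cutoff, both predicates classify an emerald identically:
    assert is_green("green") == is_grue("green", date(2024, 6, 1))
    # After the cutoff they come apart: the same sign, a different whole.
    assert is_green("green") != is_grue("green", date(2031, 6, 1))
    ```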

    And again, it is almost obvious to point out – but the study of the mereologies of artificial intelligences will be an absolutely essential piece of getting both security and safety right. This is slightly different from the study of explainability, where we look just for a mapping of systems to our own mereology so we can translate – and more fundamental: it is the view that we should look for general principles of mereological composition in large language models.

    As we do so there are a number of interesting questions to consider, such as:

    • Do we believe that AI mereologies become more like human mereologies as the size of the training data set grows? Or could the relationship between human / AI mereologies be, say, U-shaped? Why?
    • Are there fundamentals in mereology that have to do with perception, and if so – does this mean that when we add sensors to AI that represent no human sensing capabilities, we will end up with vastly different mereologies (cf. what it may be like to be a bat, as Nagel asks — what is the mereology of a bat, and are the commonalities rooted in evolution in some way)?
    • What do mereological safety risks look like and how can they be addressed in the best way?
    • Is shared mereological composition a pre-condition for alignment?

    And so on.

    Notes

    Footnotes and references

    • 1
      The sciences of wholes and parts, see for example https://plato.stanford.edu/entries/mereology/
    • 2
      See the “new riddle of induction” here https://en.wikipedia.org/wiki/New_riddle_of_induction
    • 3
      One interesting question is of course if there are general dimensions along which a mereology is constructed, and if that can be used to explore alternative mereologies?
    • 4
      A simple example may be one in which I mean “allergy” to be a medical condition that only applies before the 31st of January 2024 – a far-fetched example, but still.
