The study of failure mode (Mental Models XII)

In John Gall’s peculiar Systemantics: The Systems Bible, the reader will find a wealth of often funny but always deep insights. One that has occupied me lately is this:

The important point is: ANY LARGE SYSTEM IS GOING TO BE OPERATING MOST OF THE TIME IN FAILURE MODE. What the System is supposed to be doing when everything is working well is really beside the point, because that happy state is rarely achieved in real life. The truly pertinent question is: How does it work when its components aren’t working well? How does it fail? How well does it work in Failure Mode?

Gall, John, Systemantics

Ignore the caps and weird capitalization; the observation here is invaluable. What Gall points out is that failure mode – that is, when a system is not working as intended, but in a non-random way – is the default mode of all large systems.

There are no neat systems

This observation has a few consequences that matter whether you are looking to solve a manufacturing problem or a political issue.

  • First, always design for failure – that is: once you have decided what the system should do, spend at least as much time on how it is likely to fail, and then design that failure mode. Looking at a system for social welfare? Expect that it will be gamed – now design it so that it fails in different ways and so is hard to game in a structured way. Building a policy team for a global company to advocate on behalf of the company? Assume that it will be locked into internal debates about what to do, and design it to deal with those debates effectively (in any large organization, design should be occupied with understanding the particular and enormously interesting failure mode of internal politics). In short: look at what the system is intended to do, assume it will fail, and then design the failure.
  • Second, understand other systems in terms of their failure mode. Never become exasperated with a system not working – it won’t start working anytime soon – but adapt to the most recurrent failure mode of that system. How does a bureaucracy fail most often? That is the question that matters if you want to work effectively. Understand and catalogue failure modes, and be open about researching them. Look at US politics now – the dream is to return to Athenian democracy (um, but with more, well, democracy). That will never happen. So let’s understand populism as the failure mode it is and assume that we will be working within it. Same thing for polarization and the lack of a common, shared baseline of facts. Assume that we are increasingly organized on the Galefian dimensions of scout minds and soldier minds – since that is a known and common failure mode of enlightenment society. Adapt to failure mode.
  • Third, assume that failure modes at least cycle, if not evolve. In addition to operating in failure mode, most large systems are degrading and falling apart as well. This means that they are not stable – but they may fall apart in patterns that are at least partly predictable. Design for the fall.

Can we seriously do this? Is this not cynicism? Yes, we can, and no, it is not. It is understanding the nature of complex systems – and understanding that they never work as intended. They fail, and that is how they get things done. Now, you can operate with the ideal model of the system, or you can learn to fail with the system to get things done.

