The Guardian view on the internet outage: we need resilience, not just efficiency

In Rumaan Alam’s apocalyptic novel Leave the World Behind, a protagonist notices a news alert warning of a major blackout: “She jabbed at it, but the application did not open, just the white screen of the thinking machine. This was a specific flavor of irritation.” To the character, Amanda, it is merely an inconvenience. But the reader already understands that something is very wrong. The sudden severance of communications is ominous.

The novel gained an eerie sense of prescience by being published in the middle of the pandemic. Similarly, Tuesday’s massive internet outage, which saw news and other hugely popular sites around the world vanish, was perhaps more disconcerting for a public primed by Covid to the abrupt disruption of things we took for granted. The Guardian, the New York Times, the BBC, the Financial Times, CNN and Le Monde were all hit, along with internet behemoth Amazon, the site in Britain, PayPal and Reddit. What or who was responsible?

The initial anxiety fell away rapidly when the problem was identified as originating at a content delivery network run by Fastly, which resolved it in just under an hour. (Its service hosts content close to where it is likely to be requested, helping websites load faster for users.) The outage raises questions about the consolidation of internet infrastructure. For individual users, relying on one of a handful of large, well-established players doubtless makes more sense than picking from a wider, motley selection. But for the sector as a whole, it concentrates risk, so that one small problem – in this case an unspecified “service configuration” – can lead to vast outages.

Societies have yet to get to grips with these kinds of issues. Rhetorically, we may talk of new critical infrastructure. But we are not regulating accordingly. The outage was a reminder that we are increasingly dependent on services that most of us neither understand nor control, and which in many cases expose us to new risks – whether from contingency or bad actors. Just-in-time delivery saves supermarkets money, but can quickly lead to empty shelves if demand suddenly surges or supply is interrupted unexpectedly. Last month, a cyber-attack forced the operator of the US’s largest fuel pipeline to shut down. In February, a hacker tried to poison the water supply of a city in Florida.

The outage, like the pandemic, should remind governments, companies and citizens of our vulnerabilities and the need to design with not just the slickest outcomes but also the worst-case scenarios in mind. A more sophisticated approach might see organisations compiling “complexity registers”, looking not simply at the likelihood and impact of events occurring (as risk registers do), but also at their broader implications and the risk that they could trigger a series of crises. We should focus not only on ensuring that our systems don’t fail, but also on knowing how to recover if they do. The market will inevitably prioritise lean operations. However, wiser companies, as well as regulators, should realise that building in redundancy may be inefficient in the short term but is often necessary. And as well as understanding the systems on which we depend and developing adequate plans for potential problems, we must make sure we heed those plans when the time comes.

The danger is that these kinds of events could instead desensitise us to technical or organisational failures, leading us to assume that we will always muddle through. We often do. We cannot count on it.