This article digs into what happens when things go wrong at large-scale infrastructure providers. Last month, several well-known companies in this segment suffered widespread outages, and their engineering teams later shared postmortems covering what went wrong and what they learned. Of course, many startups never get large enough to operate tens of thousands – never mind millions – of virtual machines (VMs).
