column "It's always DNS" is a long-standing sysadmin saw, and with good reason: a disproportionate number of outages are at their heart DNS issues. And so today, as AWS is still repairing its downed cloud as this article goes to press, it becomes clear that the culprit is once again DNS. But if you or I know this, AWS certainly does. And so, a quiet suspicion starts to circulate: where have the senior AWS engineers who've been to this dance before gone? And the answer increasingly is that they've left the building — taking decades of hard-won institutional knowledge about how AWS's systems work at scale right along with them. What happened? AWS reports that on October 20, at 12:11 AM PDT, it began investigating “increased error rates and latencies for multiple AWS services in the US-EAST-1 Region.” About an hour later, at 1:26 AM, the company confirmed “significant error rates for requests made to the DynamoDB endpoint” in that region. By 2:01 AM, engineers had identified DNS resolution of the DynamoDB API endpoint for US-EAST-1 as the likely root cause, which led to cascading failures for most other things in that region. DynamoDB is a "foundational service" upon which a whole mess of other AWS services rely, so the blast radius for an outage touching this thing can be huge. As a result, much of the internet stopped working: banking, gaming, social media, government services, buying things I don't need on Amazon.com itself, etc. AWS has given increasing levels of detail, as is their tradition, when outages strike, and as new information comes to light. Reading through it, one really gets the sense that it took them 75 minutes to go from "things are breaking" to "we've narrowed it down to a single service endpoint, but are still researching," which is something of a bitter pill to swallow. To be clear: I've seen zero signs that this stems from a lack of transparency, and every indication that they legitimately did not know what was breaking for a patently absurd len...
First seen: 2025-10-20 22:07
Last seen: 2025-10-21 15:10