December 12, 2025 ✜ 6 minutes

In late 2022, I had a conversation with a senior engineer about the coming problem of “what to do when AI is writing most of the code”. His opinion, which I found striking at the time, was that engineers would transition from writing mostly “implementation” code to mostly writing tests and specifications. I remember thinking this was prescient. With three years of hindsight, it seems like things are trending in a different direction.

I thought the reason testing and specifications would be useful was that AI agents would struggle to “grok” coding for quite some time, and that you’d need robust specifications so they could stumble toward correctness. In reality, AI-written tests were one of the first tasks I felt comfortable delegating. Unit tests are squarely in-distribution for what the models have seen across public open source code: there are a lot of unit tests out there, and they follow predictable patterns. I’d expect the variance of implementation code – and its requirement for out-of-distribution patterns – to be much higher than that of testing code. The result is that models are now quite good at translating English descriptions into crisp test cases.

System Design

There exists a higher-level problem of holistic system behavior verification, though. Let’s take a quick diversion into system design to see why.

System design happens on multiple scales. You want systems to be robust – both in their runtime behavior and in their ability to evolve iteratively. This nudges towards decomposing systems into distinct components, each of which can be internally complicated but exposes a firm interface boundary that lets you abstract over that internal complexity. If we design things well, we can swap out parts of our system without disrupting other parts or harming the top-level description of what the system does. We can also perform top-down changes iteratively – addin...
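To make the point about firm interface boundaries concrete, here is a minimal sketch in TypeScript. The names (KeyValueStore, InMemoryStore, CachingStore, recordVisit) are hypothetical illustrations, not anything from a real system: the consumer depends only on the contract, so either implementation can be swapped in without touching it.

```ts
// The boundary: callers only ever see this contract.
// (All names here are hypothetical, for illustration only.)
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// One internally simple implementation...
class InMemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// ...and another that could hide arbitrary internal complexity
// (caching, persistence, replication) behind the same contract.
class CachingStore implements KeyValueStore {
  private cache = new Map<string, string>();
  constructor(private backing: KeyValueStore) {}
  get(key: string): string | undefined {
    if (!this.cache.has(key)) {
      const value = this.backing.get(key);
      if (value !== undefined) this.cache.set(key, value);
    }
    return this.cache.get(key);
  }
  set(key: string, value: string): void {
    this.cache.set(key, value);
    this.backing.set(key, value);
  }
}

// Code written against the interface never changes when the
// implementation behind it is swapped out.
function recordVisit(store: KeyValueStore, user: string): void {
  const count = Number(store.get(user) ?? "0") + 1;
  store.set(user, String(count));
}

// Swapping implementations requires no change at the call site:
recordVisit(new InMemoryStore(), "ada");
recordVisit(new CachingStore(new InMemoryStore()), "ada");
```

The consumer’s behavior, and any tests written against it, stays stable while the internals change behind the boundary.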