What AI copilots can - and cannot - do for engineering organisations

Over the last year I've led the embedding of generative-AI tooling across the software development lifecycle in a large engineering organisation: code-review assistants, automated test generation, documentation generation. The individual tools are not the interesting part - vendors ship and iterate those constantly. The interesting part is what happened to the organisation.
What worked
Documentation generation is the clearest win, and the one most teams undervalue. Much of the friction in large engineering organisations is not that code is hard to read - it's that the context around the code is scattered and out of date. Copilots close that gap faster than any human-led documentation initiative I have seen.
Test generation helped most where testing was already part of the culture, and helped least where it wasn't. This surprised me initially. It shouldn't have. A team that doesn't value tests isn't going to value the ones a model writes either.
What didn't
Pure velocity metrics - commits per engineer, story points closed - moved in ways that felt both real and meaningless. What actually changed was not how fast individual engineers wrote code. It was how much of the bad first draft survived into the repository, and how quickly the team converged on a shared style.
Do not measure the copilot. Measure the organisation it lives inside.
The leadership shift
The biggest shift is in the role of the engineering leader. You spend less time optimising individual throughput and more time designing the environment the copilots live in: the code review conventions, the testing culture, the documentation expectations. The copilot is a force multiplier, and you are increasingly a designer of the forces.


