All posts
27 JANUARY 2026 · · 6 MIN

The quiet moment when agentic AI starts working

The quiet moment when agentic AI starts working
I don't get excited about new AI tools anymore. Most of them are incremental, even when the marketing isn't. But a recent open-source agent release produced the specific feeling I had the first time I used ChatGPT - the quiet moment of 'okay, this changes things'. That feeling is rare. It's worth understanding when it actually means something and when it's just the marketing doing its job.

ChatGPT's arrival, three years ago, produced the feeling for a specific reason. It crossed a capability threshold from 'tools that help me do my work' to 'tools that do the work with me as the orchestrator'. The shift wasn't about how fluent the model was, although it was fluent. It was about the fact that conversational interaction with a machine suddenly felt natural. That had never been true of any prior system, and once you'd experienced it, your mental model of what software could do updated quietly and permanently.

What the new threshold looks like

The shift the agentic release points to is a different threshold. ChatGPT changed how we talk to machines. The agentic wave points to machines that actually do work on the user's behalf, over multiple steps, against real tools. The interaction model is no longer 'I ask, it answers'. It's 'I describe a task, it plans, calls APIs, checks its work, and returns a result'. That's a categorically different pattern.

The specific thing that produces the quiet moment is watching the system do a task you'd normally do yourself - not perfectly, not always, but credibly enough that you'd trust it with a small version of the job. The first time this happens for any individual user, the mental model updates. You can't argue someone into this shift. They have to feel it.

The open-source angle

The interesting additional variable is that the most striking recent release is open-weights. This matters more than most of the consumer coverage acknowledges, for operational reasons that are rarely glamorous. An open-weights agent can be audited, self-hosted, governed by your organisation's own controls, and deployed in environments that can't accept a closed-frontier dependency. Several markets have been waiting for exactly this - regulated industries, government, enterprises in jurisdictions where API-based AI deployments are legally fraught.

The closed frontier models have been producing capability advances that a large fraction of the market couldn't actually adopt. The open-weights equivalent unblocks adoption specifically for the segment that had been gated by governance concerns, not capability concerns. This is not a democratisation story in the naive sense. It's a market-segmentation story. The base capability being open changes who can build on it, which changes the competitive dynamics of the agent-tooling layer above it.

The skeptical read

Every major AI release for the last three years has been preceded by 'this changes everything' announcements and followed, six months later, by 'this is useful for specific cases with known limitations'. The pattern is reliable enough that any single excited reaction should be discounted heavily. The fact that something feels like a capability threshold is not evidence that it is one. I'm aware of this bias in my own reading and trying to correct for it.

The specific reason I trust the feeling this time is that the capability being demonstrated - multi-step task execution against real tools with meaningful reliability - is the category we've been waiting on for two years. Every agent framework before this one was somewhere on the spectrum between 'promising demo' and 'unreliable in production'. The numbers coming out of teams running this in production are the first I've seen that are within range of usable. That's the specific signal I'd trust more than any single user's excitement.

You can't argue someone into this shift. They have to feel it.

What changes operationally

If the capability threshold holds, the thing that changes is the unit of work. Today, a product team builds features. Each feature is a bounded piece of functionality shipped in a sprint cycle. In a world where agents reliably execute multi-step tasks, features become smaller and workflows become the unit. You don't build 'a form to file expenses'. You build 'an agent that monitors your spend and files expenses for you', which is a categorically different product, and the team structure, PM roles, and design pattern all shift to match.

Most organisations are not ready for this reshaping, and the ones that move first will have an advantage that compounds. The teams that are already experimenting with agent-first workflows in low-risk settings are building muscle that will matter when the capability threshold is universally acknowledged. The teams that are waiting for certainty will arrive to the party late and buy the same capability at a higher price with less experience using it.

What to do this quarter

Three concrete suggestions for anyone in an engineering leadership seat. First, give two of your smartest engineers one week each to build something with a current open-weights agent framework. Not for production - for calibration. You need their mental model updated before the org-level decisions hit your desk. Second, start tracking the cost curve. Agentic workloads consume tokens at rates that dwarf chat interactions, and the financial shape of your AI spend is going to change. Third, name a specific person as the owner of agent governance. Incident response, access control, tool-use audit. This role doesn't exist at most organisations yet, and it will within eighteen months.

The feeling I had with the recent release is the same feeling I had with early ChatGPT, and I've learned to take those feelings seriously while remaining calibrated about the specifics. The capability is real. The timeline is slightly longer than the excitement suggests. The organisations that spend the next six months building familiarity and governance are the ones that will be positioned when the capability becomes routine. That routine state is closer than most of the market is pricing.

← Previous
The person being referred is the one you want to hear from
Next →
Don't build a business on AI getting worse

Discussion

Email used only for your avatar. Never shown, never stored in plain text.