On AI: What happens when judgment is cheap but output is free?
How speed, scale, and false inevitability are reshaping the systems we depend on
Somewhere between the last automated checks, a build that had finally gone green, and the quiet urge to just push the change and move on because everything said it was fine—
I stopped.
Nothing was failing. There were no alarms. Earlier flakiness had settled, the dashboards were green across the board, health checks passing, and by any reasonable definition I was ready to promote this to production.
But I couldn’t shake the feeling that something was off.
Not broken. Just… wrong.
That kind of feeling is awkward to justify, because it’s not attached to a specific signal. It doesn’t point cleanly to a log line or a test failure. It’s the sense that the shape of the situation doesn’t quite line up with past experience, even if you can’t yet explain why.
So instead of deploying, I followed my nose, in a manner of speaking.
Eventually I found it: a wait-for-port utility I’d added while trying to improve the developer experience in this large-ish codebase, and then left in place. It was still listening in the production runtime, where it had no business doing so. Invisible to the checks I’d run. Harmless in development. But under the wrong conditions, it would have stalled a critical path and taken down a system that, until then, had managed a quiet 100% uptime for a year or so.
Nothing in the tooling caught it. Nothing in the process objected. The only thing that intervened was a vague, unprovable spidey sense that I couldn’t explain and hadn’t had in quite some time.
From the outside, that decision looks irrational. There was no hard evidence. No rule violated. No alert triggered. By every formal measure, proceeding would have been reasonable.
Machines can’t make that kind of call.
AI doesn’t hesitate (at least not without it being artificial). It doesn’t get uneasy about situations that are technically permissible but contextually suspect. It doesn’t accumulate the low-level intuition that comes from watching small, boring mistakes turn into large, expensive failures and wincing at your own stupidity.
By any strict logic, stopping in that moment was absurd.
It was also correct.
And that gap — between what our systems say is acceptable and what experienced judgment says is safe — is where a lot of today’s fragility is sitting.
It’s really hard to predict abnormal failure.
In the past couple of months, nearly every major cloud provider that underpins modern civilisation has stumbled in unusually public ways.
AWS had a significant US East outage. Azure pushed a configuration change that rippled across core services. Google Cloud botched an internal policy update that briefly jolted a wide range of products. Cloudflare spent hours fighting a configuration file that had quietly grown beyond what its own systems could reliably handle.
“We shape our tools and thereafter our tools shape us.”
— Marshall McLuhan (paraphrased)
Outages themselves aren’t new. Distributed systems fail. Regions go dark. PagerDuty starts screaming, and SREs, ever the unsung heroes, recover. That’s the cost of scale.
What’s different is the shape of these failures. They’ve been broader, longer, and uncomfortably close together. Less like isolated trips, more like a system wobbling under sustained strain.
These incidents don’t point to a single root cause. They point to an environment where changes are happening faster than people can meaningfully reason about their consequences.
That’s not an AI problem in isolation, but rather a tool-misuse problem at scale.
But all the cool kids* are doing it?!
Earlier this year, Microsoft openly acknowledged that a significant portion of its internal codebase is now written or assisted by AI systems. Amazon, Google, and others have shared similar numbers, often landing somewhere between 20 and 30 percent.
This isn’t documentation or toy scripts. It’s production code. The plumbing that keeps governments, hospitals, banks, and global commerce running.
None of this means an AI model directly caused any specific outage. That’s the wrong level of analysis.
What it does mean is that far more code is being created, modified, and deployed than ever before. Faster than ever before. With fewer pauses along the way to ask basic questions like “does this actually make sense?” or “what happens if this interacts badly with everything else?”
When output accelerates faster than review, tiny mistakes don’t stay tiny. They propagate. Configuration files quietly grow past design assumptions. Policy updates slip through validation. Rollouts move faster than any single human’s ability to notice something subtle but dangerous.
The hammer can always swing faster, but the Windows (pun intended) don’t typically get stronger, do they?
*I’m loath to call them cool kids, but it felt like the right “vibe” for this context.
Checkers, chess, or Angry Birds?

Very recently, Microsoft’s own President of Windows and Devices, Pavan Davuluri, was unusually candid about Windows 11 suffering from sustained churn, with core areas being changed and reworked faster than they could settle. I don’t think that’s because the engineers suddenly forgot how to build software. It’s because the complexity, surface area, and rate of change have outpaced the organisation’s ability to stabilise what it ships.
This is a company that has been building operating systems for decades. A company with extraordinary engineering depth. If they are struggling to keep a flagship consumer OS coherent while simultaneously accelerating AI-assisted development, that’s not a moral failing.
It’s a systems signal.
When the people closest to the machinery start admitting they can’t fully reason about the whole thing anymore, we should sit up and pay attention.
The same pattern, one layer up
Regrettably, the plumbing isn’t the only part of the internet showing signs of strain. The public web is visibly drowning in what can only be described as a torrential onslaught of slop.
Independent analyses over the last year estimate that roughly 30 to 50 percent of new English-language web content now shows clear AI fingerprints. In SEO farms and low-value content mills, the number is much higher.
This is why so much of the internet now feels strangely hollow. Like it was written by something that understands the shape of language but not its purpose. I often picture it as a farmers-market-esque row of puppet shows, where a bunch of puppets try to convince each other that they, too, are real boys.
The pattern unfortunately also mirrors what’s happening in codebases everywhere. More output. Less friction. Less scrutiny. Different layer, same dynamics.
Again, this isn’t because AI is malicious — AI doesn’t think or feel, have desires or motives, or know anything beyond statistical next-token prediction right now — it’s because a very efficient hammer is being used everywhere simply because it can be.
The reliability gap
Cloud platforms were originally designed for a slower era. Engineers reviewed the changes that mattered; often two very senior people had to turn their keys before a launch. Deployments happened in controlled windows. Someone signed off because the stakes were obvious.
Now everything ships constantly. Automatically. And a growing share of those changes originate from systems that do not possess intuition, caution, or doubt.
Machines are excellent at speed and volume. They are very bad at having the intuition to notice when something feels “off”.
This doesn’t prove that AI causes failures. It does, however, explain why failures now feel broader, stranger, and harder to predict. More changes, in more places, by more automated systems means the internet is moving faster than the safety rails built to protect it.
You can’t automate away the laws of nature.
Underneath all of this is a constraint you can’t just magically defer or automate away: hardware.
High-bandwidth memory is one of the tightest choke points in the global technology supply chain. The same HBM stacks are required for data-centre GPUs, AI accelerators, advanced networking gear, and increasingly high-end consumer devices. Expanding capacity takes years, not quarters.

At the same time, companies like Nvidia have been clear about where their priorities lie. Data-centre and AI accelerators carry far higher margins than consumer hardware, and production capacity is being redirected accordingly through 2025 and into 2026.
Reliability depends on slack: spare capacity, the ability to absorb mistakes quietly.
Scarcity removes slack. Systems run hotter and closer to their limits, and become far less forgiving of error. When software velocity accelerates while physical headroom disappears, failures stop being edge cases and start looking systemic.
It’s also important to be clear about where AI genuinely shines.
Some of the most meaningful scientific advances of the last decade have come from exactly these systems. AlphaFold didn’t “move fast and break things.” It operated inside tight constraints, with clear ground truth, strong feedback loops, and rigorous evaluation against reality.
The same pattern shows up in drug discovery, materials science, medical imaging, and climate modelling. Domains where mistakes are visible, iteration is grounded, and outputs are tested against the world rather than quietly merged into production.
In those contexts, AI doesn’t replace judgment — it amplifies it.
The pattern is consistent: clear constraints, tight feedback loops, and outputs checked against reality before anyone depends on them.
When AI amplifies careful human work, it’s extraordinary.
When it replaces judgment in complex, tightly coupled systems, it becomes dangerous.
That said, let’s not beat around the bush: the parts of the system that still work are the ones where human judgment hasn’t been automated away.
Teams that move a little slower on critical paths. Engineers who stop a deployment because something feels wrong, even when every dashboard says otherwise. Organisations that treat review, pause, and restraint as features rather than inefficiencies.
These are adaptive practices, not necessarily nostalgic ones.
The irony is that AI makes this more important, not less. As output gets cheaper and faster, judgment becomes the scarce resource. The bottleneck shifts from how much we can produce to how carefully we decide what’s worth building — and how it will actually be experienced by the people on the other side.
Systems that recognise this, that deliberately preserve human checkpoints where the cost of failure is nonlinear, don’t look slow. They look resilient.
The illusion of inevitability
Despite the marketing, large language models are not thinking machines. They are extremely sophisticated prediction engines. Next word. Next token. That’s it.
The newer “reasoning” models add structure and repetition, breaking problems into steps and running the model multiple times. From the outside, this looks a lot like thought. Underneath, it’s still statistical autocomplete with better choreography.
These systems don’t know when they’re wrong. They don’t hesitate. They don’t experience the moment of doubt that stops a human from pushing a risky change live.
That distinction matters most when urgency removes the option to slow down.
The real danger isn’t AI itself. It’s the story we’re telling ourselves that using it everywhere, immediately, is unavoidable. That if we don’t swing the hammer now, someone else will.
We’ve seen this movie before, haven’t we?
We have. This is how arms races form. Companies and countries race each other, promising to litigate and clarify policy after the fact, but capability almost always compounds faster than governance, and any caution gets reframed as weakness.
We’ve lived through this dynamic before with nuclear weapons. Unprecedented power, catastrophic downside, and enormous incentives to move first.
We survived not because nukes turned out to be harmless, but because we eventually agreed that some technologies are too dangerous to optimise blindly. Treaties. Inspections. Red lines. Shared constraints, even among rivals who trusted each other very little.
It wasn’t perfect, but it dramatically reduced the probability of irreversible harm.
So what now?
This isn’t a call to abandon AI. It’s a call to stop pretending that acceleration is always a virtue, or that it is inevitable.
But we need to slow down where judgment still matters. We need regulation that meaningfully constrains deployment in high-risk domains. We need international agreements that reduce the incentive to race blindly toward capability without control.
The fragility we’re seeing across the internet today isn’t the catastrophe; it’s the warning. It’s what happens when powerful tools outrun the wisdom required to use them well.
The good news is that this fragility is the result of choices we’re actively making, which means we can make different ones. We’ve always had a choice. Heck, we still do.
The alternative is accepting a small but growing probability of irreversible harm and calling it progress.
That feels like a bet we don’t actually need to take.



