They Gave AI Coding Tools to Non-Developers. Here's What Broke.

The promise was compelling: AI has lowered the barrier to building software so much that you don't need engineers anymore. A company tested that premise. This is what it actually cost them and what every executive needs to understand before running the same experiment.

Somewhere in the last two years, a seductive idea took hold in enterprise boardrooms: if AI can write code, you no longer need people who understand code. The productivity demos are dazzling. The vendor decks promise 10x output. And the cost math looks irresistible when you're staring at fully-loaded engineering salaries.

So the experiment got run. AI coding tools were placed directly in the hands of non-developers: business analysts, product managers, and operations staff, who were asked to build a real software product. Not a prototype. Not a hackathon toy. A production-bound system with actual business requirements and real compute costs attached to it.

The project was sunsetted after hundreds of hours of work and tens of thousands of dollars in compute spend.

The lessons from observing this effort closely are worth sharing. The point is not to embarrass anyone involved; the effort was genuine and the people were talented. The point is that the same experiment is being quietly run inside dozens of enterprises right now, and most of them don't yet know what the bill will look like.

"AI didn't lower the barrier to building software. It lowered the barrier to producing code. Those are not the same thing."
— David Rizzo

The Setup

The goal was straightforward on paper: deploy a modern AI coding assistant and give a cross-functional team the tools to build an internal workflow application without requiring dedicated engineering headcount. The team was bright, motivated, and had enough technical fluency to navigate the tools. Leadership was enthusiastic. The timeline was aggressive but not unreasonable.

What became clear over time is that building software is not a problem of syntax; it never was. AI has essentially solved syntax. What it hasn't solved is judgment. And without that judgment, the output of AI coding tools in the wrong hands looks like progress right up until it doesn't.

The Three Failure Modes

These weren't one-off bugs or implementation slip-ups. They were structural, repeatable, and, in hindsight, entirely predictable.

Failure Mode 01

No Architectural Foundation

The team started building immediately. That was the problem. Architecture isn't a phase in software development — it's the constraint system that makes every subsequent decision cheaper, faster, and reversible. Without a defined architecture and a coherent architectural strategy, AI becomes a tool for accelerating local decisions with no global awareness. Every prompt optimizes for the prompt. Nobody in the room had the expertise to establish the north star the AI was supposed to be building toward. The result: a collection of working pieces that didn't cohere into a working system.

Failure Mode 02

Inability to Recognize Bad or Unneeded Code

This is the failure mode nobody talks about, and it's the most insidious. The code ran. Tests passed. It looked like progress. But no one on the team had the experience to recognize redundant logic, inefficient queries, unnecessary API calls, or the absence of reusability across modules. Worse, they couldn't identify code that simply shouldn't exist: functions built to solve problems that a sound architecture would have prevented. AI is a confident generator. It produces code whether that code is optimal, tolerable, or wasteful. Evaluating the output requires exactly the expertise you thought you were replacing. The technical debt accumulated silently with every sprint.
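To make "the code ran, tests passed" concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `fetch_customer_tier` stands in for any slow or billed remote call, and the order data is invented. It is not code from the project described here, only an illustration of how two functions can return identical results while one makes four times as many calls.

```python
call_log = []  # records every hit on the (hypothetical) remote service


def fetch_customer_tier(customer_id):
    """Stand-in for a slow or billed remote call; hypothetical."""
    call_log.append(customer_id)
    return "gold" if customer_id % 2 == 0 else "standard"


def tiers_naive(order_customer_ids):
    # One remote call per order, even when the same customer repeats.
    return [fetch_customer_tier(cid) for cid in order_customer_ids]


def tiers_deduplicated(order_customer_ids):
    # One remote call per distinct customer; results are reused.
    cache = {cid: fetch_customer_tier(cid) for cid in set(order_customer_ids)}
    return [cache[cid] for cid in order_customer_ids]


orders = [1, 2, 1, 2, 1, 2, 1, 2]  # invented data: two customers, eight orders

call_log.clear()
naive_result = tiers_naive(orders)
naive_calls = len(call_log)

call_log.clear()
dedup_result = tiers_deduplicated(orders)
dedup_calls = len(call_log)

# A correctness test sees identical output and cannot tell the two apart.
assert naive_result == dedup_result
print(f"naive: {naive_calls} calls, deduplicated: {dedup_calls} calls")
```

Both versions pass the same test. Only a reviewer who asks how many calls each one makes will notice the difference, and that question is exactly the kind of judgment the team lacked.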

Failure Mode 03

No Model for Runtime Compute Cost

This was the final reckoning. AI-assisted development made it easy to invoke models, trigger calls, and chain operations in ways that felt natural during development. What nobody modeled was what those patterns cost at runtime, at scale, with real usage volumes. The compute bill grew in ways that were invisible to the team, not out of negligence but because runtime cost modeling is a deeply technical discipline. Understanding token throughput, API call patterns, memory consumption, and inference cost per operation requires the kind of systems-level thinking that doesn't come from business analysis or product management backgrounds. By the time leadership saw the numbers, the gap between build cost and operational viability had made the project untenable.
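A runtime cost model does not have to be elaborate to be useful. The sketch below, in Python, shows the basic arithmetic: tokens per call, calls per request, requests per day. Every number and name in it is an assumption made up for illustration (the call patterns, the traffic volume, and the per-token prices); none of it comes from the project described here or from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass
class CallPattern:
    """One kind of model call made per user request (all values hypothetical)."""
    name: str
    calls_per_request: int
    input_tokens: int
    output_tokens: int


def monthly_cost(patterns, requests_per_day, usd_per_1k_input,
                 usd_per_1k_output, days=30):
    """Estimate monthly inference spend in USD for a set of call patterns."""
    per_request = 0.0
    for p in patterns:
        per_call = ((p.input_tokens / 1000) * usd_per_1k_input
                    + (p.output_tokens / 1000) * usd_per_1k_output)
        per_request += p.calls_per_request * per_call
    return per_request * requests_per_day * days


# Invented workload: one cheap call and one chained, token-heavy call.
patterns = [
    CallPattern("classify_ticket", calls_per_request=1,
                input_tokens=2000, output_tokens=100),
    CallPattern("summarize_history", calls_per_request=3,
                input_tokens=8000, output_tokens=500),
]

# Assumed prices and traffic; replace with real figures before relying on this.
estimate = monthly_cost(patterns, requests_per_day=5000,
                        usd_per_1k_input=0.01, usd_per_1k_output=0.03)
print(f"Estimated monthly spend: ${estimate:,.2f}")
```

Under these assumed inputs the estimate comes to roughly $46,200 a month, and almost all of it comes from the chained call that runs three times per request. A pattern like that feels free during development and dominates the bill in production.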

100s of hours invested before the project was sunsetted

$10K+ in compute spend before the cost model was understood

What the Experiment Actually Proved

The conclusion isn't that AI coding tools are overhyped. They're extraordinary. Engineering teams using them well are accelerating delivery in ways that would have seemed implausible three years ago.

The conclusion is that AI changes the shape of an engineering team, not the need for engineering judgment.

A skilled engineer with AI tools is not 10% more productive. In the right context, they can be an order of magnitude more productive. They can own product thinking and engineering execution simultaneously in ways that previously required two or three people. That's the real leverage point, and it only exists when someone in the room can evaluate what the AI is producing.

Remove that judgment layer, and what you get is expensive, confident mediocrity at scale.

"You still need to hire and retain people who know what good looks like. AI changes the shape of your team, not the need for judgment."
— David Rizzo

What Leadership Should Take From This

If you're a CEO, COO, or board member who has been shown a productivity deck promising dramatic savings through AI-enabled non-developer development, ask these three questions before you greenlight the experiment:

Who is defining the architecture? If the answer is "the AI will figure it out," stop the project before it starts. Architecture is a human discipline that requires organizational context, constraint awareness, and long-term systems thinking that no current AI tool provides autonomously.

Who is reviewing the code for quality, not just correctness? Passing tests confirm that code does what it claims to do. They don't confirm that the code is efficient, maintainable, non-redundant, or worth having at all. That evaluation requires a trained eye.

Who has modeled the runtime cost? Development cost and operational cost are two entirely different numbers. If your team can't produce a defensible compute cost model before you move toward production, you're flying blind on the economics of the thing you're building.

The Bottom Line

The experiment wasn't a failure of people or effort. It was a failure of premise. A premise the AI vendor ecosystem has been happy to encourage because it makes for better marketing than the truth.

The truth is this: AI is the most powerful software development accelerant to emerge in nearly thirty years of enterprise software history. It's also a tool that amplifies whatever judgment is already in the room. Put it in the hands of people with deep technical judgment and you get extraordinary leverage. Remove that judgment layer and you get an expensive, confident mess.

Save yourself the compute bill. Hire the judgment first.
