
When AI makes building cheap, it’s everything else you skipped that costs you

AI coding tools don’t create risk by changing who can ship. They create risk by making it cheap to skip the planning that expensive building used to force. OXD’s Director of Software Development, Steve Ly, on what gets lost when getting it wrong stops being expensive.

When I joined OXD over two decades ago, one of my first projects was an internal invoicing app. We needed to generate invoices for clients, and rather than buy something, we just built it.

That was normal in 2002. But building was expensive, weeks of developer time at minimum, and that expense forced a question before anyone wrote a line of code: is this actually worth it? Over time the answer kept coming back no. Commercial software improved, SaaS took off, and the cost of maintaining homegrown apps outweighed the benefits. We stopped building for internal use and started buying instead.

And now? I’m building internal apps again.

Lately I’ve heard the same thing from our team: when an operational need comes up, the first instinct is no longer to find a SaaS product. It’s to build something. We recently built a tool to manage our software approval process rather than procuring one. We’ve gone full circle, and we’re not an isolated case.

AI coding tools have made building functional software genuinely easy. Lovable and Replit generate full applications from a conversation; Cursor and GitHub Copilot help experienced developers move faster than ever. By early 2025, 25% of startups in Y Combinator’s Winter batch were running on codebases that were 95% AI-generated, and GitHub reports that nearly half of all new code committed today is AI-generated.

Yes, more people can build now. But the more consequential change gets less attention. When building was expensive, the cost forced you to think first: to scope the problem, question whether you understood the context, and consider what happened at the edges, because getting it wrong was costly enough to make the thinking worth it. Cheap building removes that pressure. Not the ability to plan, but the incentive to. And that pressure was doing more work than anyone realized.

Where building beats buying

Off-the-shelf software is built for the median use case. It covers maybe 80% of what you need, and that last 20% gets worked around, duct-taped together, or abandoned. Bespoke internal tools can fit the workflow rather than forcing the workflow to fit the tool.

The economics look good early on, too. And the procurement alternative isn’t painless: evaluation, vendor comparison, legal review, security sign-off, configuration, training. It’s not unusual to spend six months to a year before anyone gets value out of a new tool. We went through exactly that with our own sales software. That timeline alone makes building look like the faster path.

So building can be the right call. The trouble isn’t the decision to build. It’s that the decision now gets made without the scrutiny that used to come built into the price.

What expensive building used to force

When getting it wrong was costly, you planned to avoid the cost. Take that pressure away and planning becomes optional, and the problems it used to catch show up downstream instead. None of them are new. All of them now arrive faster, because the friction that used to hold them back is gone.

Figure: three columns of text. Structural debt: code gets harder to change or refactor over time. Security/compliance: data exposure, lack of oversight, privacy risks. Continuity risk: shadow IT, apps orphaned when creators leave.

The code degrades, even in expert hands

This isn’t only a non-developer problem, and that’s the clearest sign it’s about incentives, not skill. The code these tools produce tends to be duplicated rather than reused, and patched rather than refactored (that is, restructured to stay clean and changeable as it grows). When the pressure to plan disappears, so does the pressure to restructure, and the code gets harder and riskier to change over time, even when it still works.

The evidence is consistent, and much of it involves professional developers. GitClear analyzed 211 million changed lines of code and found refactoring activity dropped by more than half between 2021 and 2024, while copy-pasted code climbed. A 2025 study of 807 GitHub repositories that adopted Cursor found a 41% increase in code complexity, meaning code that is measurably harder to modify safely, alongside only temporary velocity gains. The pattern is the same throughout: speed up front, drag later. Alex Turnbull, founder of Groove, estimates that more than 8,000 startups built production apps with AI in 2025 and now need full or partial rebuilds, which he has called an incoming wave of rescue engineering.

The security exposure is serious

The starkest examples come from the public sector. Ontario’s Auditor General found in a 2026 special report that 60% of AI websites (including assisted coding tools) accessed by Ontario Public Service staff were unsafe or unsecured, and that the Ministry had no controls to stop staff from uploading Ontarians’ personal information to them. Only 3% of OPS staff had completed AI safety training, and it wasn’t mandatory. This wasn’t a rogue actor. It was sanctioned use of AI tools, with no one having planned for how to make that use safe.

It shows up in smaller settings too. Earlier this year a security researcher went to a medical appointment and found the practice had built its own patient management system with a coding agent. Thirty minutes of poking gave him full access to all patient data. The data was unencrypted and sat on a US server with no data processing agreement, and voice recordings were sent to external AI services. The builder had no idea what laws they’d likely broken, which is the point. Nothing in the building forced them to ask.

A 2026 ProjectDiscovery survey of 200 enterprise security practitioners found two-thirds of security teams now spend more than half their time manually validating AI-generated code rather than fixing the underlying problems. AI tools don’t write insecure code on purpose. They write code optimized for functionality, not security, and the people deploying it often can’t tell the difference.

Shadow IT—again, but with greater speed

This is the oldest problem of the bunch. Most apps built this way, what the industry has started calling vibecoding, don’t get built by IT. They’re built by people in operations, program delivery, policy, or HR who saw a problem and had a tool that could fix it. That’s citizen development, and it’s genuinely valuable, right up until the app has production data, until the person who built it leaves with the only mental model of how it works, until an auditor asks where the data is flowing.

This pattern already has a name: shadow IT. It’s not new; what’s new is the velocity. When building takes weeks, that friction slows accumulation. When it takes an afternoon, you can end up with dozens of apps in active use before anyone has done an inventory. We’ve written before about clients who’ve gone through exactly this rationalization process, not because the apps were bad, but because nobody knew what existed, what data each one touched, or what would happen if its builder moved on.

For government organizations, the friction AI removes was doing real work. Mandatory supply arrangements, IT security assessments, and accessibility requirements exist for a reason. When teams route around them, the risk doesn’t disappear; it moves somewhere less visible, which is also where the data goes: each app gets its own store, the lineage breaks, and meeting a compliance requirement later means hunting the same data across a dozen places that disagree. That’s a decade of shadow IT at scale, and AI is about to make it accumulate much faster.

Doing the planning on purpose

None of this means stop building. The tools are too good and the speed advantage too real. It means doing deliberately what cost used to do automatically: deciding what’s worth building, and what it needs to account for, before you start.

Figure: a scale showing what needs to be accounted for to buy or to build.

Roughly speaking, build when the workflow is narrow and internal, the data sensitivity is low, and someone is accountable for maintenance. Buy, or escalate to professional development, when the stakes rise: the app handles personal information, it’s customer- or citizen-facing, it could become a system of record, it needs to meet accessibility standards, or it’s wired into critical operations.
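The rule of thumb above can be written down as a checklist so it gets applied the same way every time. What follows is a minimal sketch, not a definitive policy: the `AppProposal` fields, the `recommend` function, and the three outcome labels are hypothetical names chosen for illustration, and the real thresholds are an organizational call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppProposal:
    """Hypothetical intake form for a proposed internal app."""
    name: str
    narrow_internal_workflow: bool
    low_data_sensitivity: bool
    has_maintenance_owner: bool
    handles_personal_info: bool = False
    public_facing: bool = False           # customer- or citizen-facing
    system_of_record: bool = False        # could become a system of record
    needs_accessibility_compliance: bool = False
    critical_operations: bool = False     # wired into critical operations


def escalation_reasons(p: AppProposal) -> List[str]:
    """List every raised-stakes condition that applies to the proposal."""
    checks = [
        (p.handles_personal_info, "handles personal information"),
        (p.public_facing, "customer- or citizen-facing"),
        (p.system_of_record, "could become a system of record"),
        (p.needs_accessibility_compliance, "must meet accessibility standards"),
        (p.critical_operations, "wired into critical operations"),
    ]
    return [reason for applies, reason in checks if applies]


def recommend(p: AppProposal) -> str:
    """Return 'buy or escalate', 'build', or 'review' for a proposal."""
    if escalation_reasons(p):
        return "buy or escalate"
    if (p.narrow_internal_workflow and p.low_data_sensitivity
            and p.has_maintenance_owner):
        return "build"
    # Neither clearly safe nor clearly risky: a person should decide.
    return "review"
```

The deliberate design choice is the third outcome. A proposal that trips no escalation condition but also has no accountable owner is not approved by default; it goes to a person.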

For public sector organizations, that threshold arrives sooner than most. An app that touches constituent data may fall under privacy legislation such as the federal Privacy Act, PIPEDA, or provincial statutes like Ontario’s FIPPA, depending on the organization; one that supports service delivery may owe AODA accessibility compliance; anything requestable under ATIP or municipal freedom-of-information law carries obligations that don’t disappear because the tool was built in an afternoon. Those obligations apply whether or not the builder was a professional, and whether or not they knew, which is exactly why they can’t be left to chance.

One objection is worth answering: better coding tools will keep arriving, and they’ll fix some of this, catching more security holes, writing cleaner code. But a better harness still won’t ask whether an app should touch constituent data, whether FIPPA applies, or what happens to it when its creator leaves. Those are planning questions, not coding questions, and no improvement in code generation answers them. The thing cheap building removed isn’t something a better builder puts back.

What most organizations are missing isn’t the ability to make these calls. It’s the structure to make them consistently, and it doesn’t take much:

  • An inventory of what’s been built and who owns it.
  • Basic rules about where AI-generated apps can and can’t connect.
  • A plan for what happens to an app if the person who built it leaves.
  • A policy on hardcoded credentials.
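To show how lightweight this structure can be, here is a minimal sketch of an inventory with a continuity check, plus a crude scan for hardcoded credentials. Everything in it is an assumption made for illustration: the `AppRecord` fields, the function names, and the regular expression, which is a rough heuristic that will miss cases and is not a substitute for a dedicated secret scanner.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class AppRecord:
    """Hypothetical inventory entry: what was built and who owns it."""
    name: str
    owner: Optional[str]
    data_touched: List[str] = field(default_factory=list)
    successor: Optional[str] = None  # who takes over if the owner leaves


def continuity_gaps(inventory: List[AppRecord],
                    active_staff: Set[str]) -> Dict[str, List[str]]:
    """Map each app name to the continuity problems found for it."""
    gaps: Dict[str, List[str]] = {}
    for app in inventory:
        issues = []
        if app.owner not in active_staff:
            issues.append("no active owner")
        if app.successor is None:
            issues.append("no handover plan")
        if issues:
            gaps[app.name] = issues
    return gaps


# Illustrative heuristic only: a credential-like name assigned a quoted literal.
SECRET_PATTERN = re.compile(
    r"""(password|passwd|secret|api_?key|token)\w*\s*[:=]\s*["'][^"']+["']""",
    re.IGNORECASE,
)


def find_hardcoded_credentials(source: str) -> List[int]:
    """Return 1-based line numbers that look like hardcoded credentials."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if SECRET_PATTERN.search(line)
    ]
```

Run against a list of records and the current staff list, `continuity_gaps` returns only the apps that need attention. That keeps the review short enough to repeat regularly.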

None of this is expensive or complicated. But it requires someone to decide it’s worth doing before there’s a crisis that makes the decision obvious.

Where this leaves us

Organizations are already accumulating portfolios of AI-generated apps nobody has fully audited, secured, or mapped. That’s not a uniquely AI failure. Careful teams always had to audit, design, and map their systems, and plenty skipped it long before AI. What AI changes is the rate: the unaudited pile now grows faster than anyone is watching it, because the cost that used to slow it down is gone.

The SaaS wave solved many of the problems that came from everyone building their own tools in the early web era. What we’re building now might eventually solve some of the problems of SaaS bloat and procurement drag. But these transitions tend to leave behind a sprawl of half-maintained things, software that was cheap to build and expensive to clean up.

The tools are real and the speed is real. But an app built in an afternoon may never have included the thinking the next several years will demand of it. Software people can actually rely on, that holds up under audit, scales with demand, and doesn’t fall apart when its builder moves on, has always required more than speed. It requires judgment about what to build, how, and what it means to own it after it’s live. Cheap building didn’t make those questions easier. It just made them easier to skip.