8/25/2025
When AI Writes Code

By Bill Miller

In a follow-up to my last post on AI, Mind the Memory Gap, this post digs into AI's promise, its pitfalls, and how to make it work for senior engineering leaders and the C-suite.

In the past two years, the narrative has shifted from “AI can’t code” to “AI can build apps in minutes.” The challenge is that both statements are wrong in isolation. AI is now remarkably capable at producing working software—either on its own or alongside humans—but it’s not a drop-in replacement for developers. For executives overseeing technology projects, understanding what AI can and cannot yet do well is the difference between accelerated delivery and costly missteps.

I decided to find out for myself what AI can and can’t do by dreaming up a somewhat ambitious agentic AI–enabled software project. For context, I’ve done a little coding over the years. My formal introduction was a required CS101 course in college, where I learned basic programming in FORTRAN, feeding code into a CDC mainframe via punch cards and getting results on a line printer. There were no CRTs in sight. If that reference means nothing to you, you’re either much younger, or you now have a rough guess at my age. The answer: old.

Since then, my “coding career” has been sporadic—system test scripts, small text-processing utilities, a bit of Visual Basic—never anything approaching modern, large-scale software development. That makes me a good test case: could AI help me build real code (or, more accurately, build it for me)?

I used AI chat to describe my vision and get guidance on the build. It suggested setting up an AI coding environment and offered options. I chose Claude Code with Google Cloud Shell Editor, pairing it with the free tiers (or credits) of Google Cloud, LangChain, Neo4j, and Pinecone.

The results have been a mix of amazement and frustration. I’m genuinely impressed by how much AI can now do.
But I’m equally struck by how difficult it can be to make it do exactly what I want. In my previous post, I described AI’s memory constraints as being like working on a multi-month project where every week a new smart consultant replaces the last one: you have to get them up to speed before they can do anything, then document everything for their replacement before they leave, so progress is slow. With my AI coding helper, the challenge feels even worse. It fixes something, then works on a new feature, then somehow reverts the earlier fix to a broken state, then repeats the cycle. My role becomes part product owner, part tester, part damage control—maybe it’s more like trying to get a smart three-year-old to do what you want.

This follow-up to my earlier post on AI’s “memory barrier” looks at what happens when you ask AI to produce real code for real projects—and why success requires more than just telling the AI what you want.

The Current State of AI-Generated Code

Modern AI models—especially those tuned for coding—can:

- Produce working software, on their own or alongside human developers
- Accelerate routine, mechanical, and syntactic work
- Bridge skill gaps for non-traditional contributors
- Explore candidate solutions quickly
Where It Breaks Down

Even the best coding assistants can exhibit frustrating patterns:

1. Forgetting Previous Fixes. You solve a problem in one round, only to see it reappear after the AI modifies something else. This “regression” often happens because the AI is working from an outdated view of your code, not the current state.

2. Reverting to Simpler (but Less Suitable) Methods. You ask for fuzzy, intelligent parsing, but the AI defaults back to rigid rules like regular expressions or hard-coded switch statements. This can happen even in cases where AI-driven methods would clearly be better—such as parsing natural language inputs—because the model has been trained on decades of older, deterministic code examples. In effect, it can behave like an experienced but outdated coder who keeps doing things “the way we’ve always done it,” ignoring newer agentic approaches that could deliver more flexible and accurate results.

3. Overwriting Good Code. Some AI tools regenerate entire files instead of applying targeted changes. If their working copy doesn’t include your latest fixes, those fixes disappear.

4. Context Limits in Practice. As described in the first article, the AI can only “see” so much of your project at once. In coding, this means it might not recall how an earlier module works—or why you made a certain design decision.

Why These Problems Happen

From a leadership perspective, think of this as a coordination problem between a very fast, very literal contractor and your team:

- The contractor may be working from an outdated snapshot of your code rather than its current state.
- It can hold only a limited slice of the project in view at once, so earlier modules and past design decisions fall out of scope.
- It sometimes regenerates whole files rather than making targeted edits, silently discarding recent fixes.
- Its training skews toward decades of older, deterministic code, so it drifts back toward familiar patterns.
Making AI Coding Work in the Real World

The good news: these challenges are solvable with the right process and tooling.

1. Always Work Against the Latest State. Instruct the AI (or configure your coding tool) to re-read the relevant files before making changes. In human terms: “look before you leap.”

2. Use Targeted Diffs, Not Whole-File Rewrites. Ask the AI to output only the changes needed, in a standard diff format. This minimizes the risk of wiping out good code.

3. Add Automated Regression Protection. Put tests, type checks, and linters in place. Once a fix passes, lock it in with a test—so if the AI undoes it, you’ll catch it immediately.

4. Be Explicit About Preferred Approaches. If you want AI-driven fuzzy matching instead of regex, encode that preference in your prompts or even in a “developer guide” file in your repo.

5. Keep Context Fresh. Start new coding sessions for new tasks. Long, meandering interactions can cause the AI to “remember” outdated versions of your code.

6. Balance AI and Human Review. AI can get you 80% of the way there fast, but human engineers should validate architecture, performance, and security before shipping.

The Executive Takeaway

AI coding tools are not yet “self-driving developers.” They are force multipliers: excellent at accelerating routine work, bridging skill gaps, and exploring solutions quickly—but still reliant on process, oversight, and clear boundaries. Used well, they can shorten delivery timelines, reduce backlog, and empower non-traditional contributors. Used poorly, they can create invisible regressions and technical debt that outweigh short-term gains.

The real opportunity is in designing workflows where AI and humans each do what they’re best at: AI handles the repetitive, mechanical, or syntactic heavy lifting; humans provide the strategic direction, context-keeping, and quality control that today’s AI still can’t match.
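A postscript on regression protection, since it was the step that saved me most often. Locking in a fix can be as small as the sketch below; normalize_date and its two-digit-year bug are hypothetical stand-ins for whatever your AI assistant just fixed, and the test runs under pytest or as plain asserts.

```python
# A minimal regression test that "locks in" an AI-made fix.
# normalize_date is a hypothetical function an assistant once repaired;
# if a later AI edit reverts the fix, this test fails immediately.
from datetime import date

def normalize_date(raw: str) -> date:
    # The fix being protected: accept both 4-digit and 2-digit years.
    month, day, year = (int(part) for part in raw.split("/"))
    if year < 100:            # e.g. "8/25/25" -> 2025
        year += 2000
    return date(year, month, day)

def test_two_digit_year_still_handled():
    # Regression guard: an earlier AI edit once dropped 2-digit-year support.
    assert normalize_date("8/25/25") == date(2025, 8, 25)
    assert normalize_date("8/25/2025") == date(2025, 8, 25)
```

Run in continuous integration, a suite of such guards turns the AI's "invisible regressions" into loud, immediate failures.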