AI Shift Left

AI might reshape how engineering works: different workflows, different ways of shipping software. But we have good, solid foundations in engineering that are still relevant and can still guide us, even in AI agentic times. If we go back 20-30 years, the traditional way of building software was highly influenced by RUP. From RUP came phases, roles, responsibilities, and structure. However, there was also waste, handoffs, a lack of focus on coding, and even a waterfall approach. After the agile movement, we learned that software could be built more effectively.

Before we even start talking about AI and agents, we need to understand that companies have different levels of maturity and might not all be up to date with previous waves of innovation, such as Lean, Agile, DevOps, Lean Startup, and now AI Agentic Engineering. If we look at the last almost 30 years of software engineering, we notice a movement called shift left. The idea is to pay more attention to quality before going to production, because mistakes in production are more expensive and can affect customers. Shift left is a testing concept that has existed since 2001.

Code Review as a Bottleneck

The first noticeable bottleneck is code review. Considering AI agents like Claude Code, OpenAI Codex, GitHub Copilot CLI, Gemini CLI, and many others, we can write software much faster. Code review was always a bottleneck, even before AI, but in agentic times it's even more noticeable. Usually, there are way more people writing code than people reviewing code. Recently, I was blogging about the death of code review (again), and I want to revisit this topic and touch on a few related subjects, like requirements and shift left for AI. Also, recently, I was blogging about people's behavior and environments, which also have everything to do with AI agents.

As I mentioned in the previous blog post and in other past posts, I never liked doing code review with deltas (traditional GitHub style) because, as an architect, I could never see the big picture: the patterns and anti-patterns are often hidden in deltas.

I don't want to feed the beast, but consider this: in 2023 (the last year without AI disruption, a good vintage year), we might have had 20 engineers doing PRs, say 2 per week each, with 2-5 people reviewing them. Today, the same 20 people might be doing 10-30 PRs a week each, and the same 2-5 people will be reviewing. So no matter how much actual productivity AI gives you, code review is the bottleneck.
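A quick back-of-the-envelope calculation with the numbers above shows how the reviewer load explodes (taking 2 reviewers, the worst case of the 2-5 range):

```python
# Reviewer load before and after agentic coding, using the numbers above.
ENGINEERS = 20
REVIEWERS = 2  # worst case of the 2-5 range

prs_2023 = ENGINEERS * 2        # ~2 PRs per engineer per week in 2023
prs_now_low = ENGINEERS * 10    # low end today: 10 PRs per engineer per week
prs_now_high = ENGINEERS * 30   # high end today: 30 PRs per engineer per week

print(prs_2023 // REVIEWERS)     # 20 PRs per reviewer per week in 2023
print(prs_now_low // REVIEWERS)  # 100 per reviewer per week today (low end)
print(prs_now_high // REVIEWERS) # 300 per reviewer per week today (high end)
```

Even on the low end, each reviewer goes from ~20 to ~100 PRs a week; the review capacity simply does not scale with the writing capacity.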

Why not Prompt Requests?



What are prompt requests? This is a recent idea: instead of submitting the code, you just store and submit the prompt. One day, if we actually have real AGI, this could be possible; without AGI, I don't think we can do this right now, at least not for all use cases.

For prompt requests to work, the LLM models would need to be perfect, or very good at one-shotting, because there is no further interaction or assistance from the engineer. Would the reviewer need to watch and assist Claude in applying that prompt, or not watch at all? What happens if the result is not what was expected?

We need to remember that LLMs are not deterministic; that English is not a context-free grammar and is full of ambiguity; and that LLMs are not compilers. By replaying the prompt, you could get something that does not work at all, or that requires an engineer to assist Claude Code (IMHO, that would easily kill the idea, because it would defeat the purpose).

We are Missing Something

We need to remember that engineering is a system. Claude Code is a system. I'm not talking about software. Agile methods like XP are about systems. Kanban is about systems. And systems thinking is:
"Systems thinking is a holistic, analytical approach that views complex problems as interconnected systems rather than isolated incidents, focusing on relationships, patterns, and underlying structures."

We are obsessed with prompts, and we forget that we have engineers behind Claude Code, typing and fixing Claude's mistakes all day long. That's one of the reasons why a prompt request will not work. Not everything can be one-shot.

Does anyone remember TDD? Guess what, it's a system too. 

We have a loop in TDD: first you write a test, it fails, then you make it pass, you refactor the code, and you repeat the loop. OH, but Claude Code fixes all the mistakes it makes... until it doesn't. Do you fully understand the code, the design, the architecture, and the security of what is being generated, well enough to critique it? You need to ask yourself what kind of system we are getting into...
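The TDD loop above can be sketched in a few lines (the `slugify` example is made up, just to show the red-green-refactor rhythm):

```python
# The TDD loop in miniature: red -> green -> refactor.

# Step 1 (red): write the test first. With no slugify yet, it fails.
def test_slugify():
    assert slugify("AI Shift Left") == "ai-shift-left"

# Step 2 (green): write the simplest thing that makes the test pass.
def slugify(title):
    return "-".join(title.lower().split())

# Step 3 (refactor): clean up while the test stays green, then repeat.
test_slugify()
print("green")
```

The point is not the code; it's the learning loop. Each red-green cycle teaches you something about the problem, which is exactly what gets skipped when an agent one-shots the whole app.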

IMHO, I see some traits of the system we are "getting into":
  • We are focusing on execution (would it be better to be more strategic, more analytical, or even more critical thinkers?)
  • We are getting less and less patient (what happens to the problems we have never solved?)
  • We are raising our expectations about our time and the outcomes we can deliver (pressure)
Back to TDD, Agile, and even Claude Code: engineering was always, and will always be, about learning. How fast can we effectively learn? If Claude generated an app in 2h, what did we learn?

PS: Was that 2h hands-off and completely one-shot? No, not really; there was nudging and directing.

Self-Regulating Systems


If the prompt request is not the answer, perhaps a self-regulating system like Gas Town is? The idea behind Gas Town is to be the Kubernetes of multi-agents, where multiple roles work together to build software. What's unique about Gas Town is that it is not trying to mimic the traditional waterfall software organization. That could be the answer, but we don't know yet.

Transformation vs Accumulation

I think AI Shift Left is a safer bet for how we could transform engineering. However, before we go there, we need to understand the difference between transformation and accumulation. All companies say they want to transform, but very few actually do. What actually happens is accumulation. For instance, companies have departments, and companies love buying tools. Keep those two things in mind as we revisit the movements I mentioned before: Agile, DevOps, and now AI Agentic Engineering.

What happened? Agile became a department, often close to management, and agile became a tool, often JIRA. I know this is not real agile, but this is what companies "digested" as agile. Now look at DevOps: DevOps was never a role, yet it became a team, and the tool? The AWS cloud. So companies "accumulated" or absorbed DevOps as a role, a department, and some tools. A lot of people wrongly think that DevOps is about CI/CD.

Now let's think about AI. The most probable outcome is "accumulation," meaning there will be:

* The AI Team

* The AI Department 

* The AI Tools (today ChatGPT, Claude Code, Google Nano Banana Pro for images, ...)

Companies have an easy time accumulating tools and creating new departments. What companies have a hard time doing is real transformation, which means: changing org structure, changing roles, changing responsibilities, truly rethinking what they believe and how they work. To do that, you need more than critical thinking; you need effective learning and immense willpower and persistence to do very, very hard things.

AI Shift Left


I think we need to shift left for AI. We need code review to happen on the engineer's machine. With Claude Code, a code-review agent could be triggered by a hook, like a git pre-commit hook or a Claude Code hook, before you even open a PR. A code-review agent in Claude Code is just a markdown file.
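A local review gate could look something like the sketch below. The structure is the point, not the details: `should_block` and the findings format are hypothetical, and in a real pre-commit hook the findings would come from shelling out to a coding agent (e.g., something like `subprocess.run(["claude", "-p", "review the staged diff"])`) rather than a hard-coded list:

```python
# Sketch of a local, pre-commit code-review gate (names are hypothetical).
# A real hook would obtain `findings` by asking a coding agent to review
# the staged diff; here we use a stand-in list to show the gating logic.

BLOCKING = ("security", "data-loss", "breaking-change")

def should_block(findings):
    """Fail the commit only on serious findings; style nits pass through."""
    return any(tag in f.lower() for f in findings for tag in BLOCKING)

# Stand-in for the parsed agent output:
findings = ["style: long method", "security: SQL built by string concat"]
print("block commit" if should_block(findings) else "ok to commit")
```

Notice that the gate only blocks on serious categories; if it blocked on every nit, engineers would disable the hook within a week.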

The same shift left can happen with operations. We can, and should, be using more and more policy as code. By having policies as code, we can easily generate tests (using AI), and we can build self-service platforms that unblock innovation. With good policies in code, we can have Terraform scripts applied hands-off, which would speed things up a great deal.
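As a minimal sketch of policy as code, assuming a parsed Terraform-style plan reduced to a plain dict (the resource shape and policy names are illustrative, not a real Terraform schema or tool like OPA/Sentinel):

```python
# Minimal "policy as code" sketch: validate a parsed plan (here just a
# dict) before it is applied hands-off. Resource shape is illustrative.

def check_policies(resources):
    violations = []
    for name, attrs in resources.items():
        if attrs.get("type") == "s3_bucket" and not attrs.get("encrypted"):
            violations.append(f"{name}: buckets must be encrypted")
        if attrs.get("public", False):
            violations.append(f"{name}: public resources are forbidden")
    return violations

plan = {
    "logs": {"type": "s3_bucket", "encrypted": True, "public": False},
    "dump": {"type": "s3_bucket", "encrypted": False, "public": True},
}
for v in check_policies(plan):
    print(v)  # a CI step would fail the apply if this list is non-empty
```

Because the policies are just code, an AI agent can generate many more of these checks (and tests for them) than a team would ever write by hand, which is what makes the hands-off apply safe.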

Instead of "most" of the code review happening in a PR. It could easily be (shift left) and diluted into the engineer's machine on their local Claude code. Code review (as I said in my previous post) can happen off-cycle and doesn't need to be on every PR; it could be every 2 weeks or every 30 days. 

Testing also needs to shift left (the original shift left), but there is an AI version of it. With the advent of AI coding agents, we can have (without much burden) way more testing, with proper diversity: unit testing, integration testing, chaos testing, stress testing, test induction, property-based testing, snapshot testing, contract testing, mutation testing, and much more.
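To make one of those concrete, here is a tiny property-based test using only the stdlib (real libraries like Hypothesis do this far better; the `dedupe` function is just an example target):

```python
# A tiny property-based test: instead of hand-picking cases, generate
# random inputs and assert properties that must always hold.
import random

def dedupe(xs):
    """Remove duplicates while preserving first-seen order."""
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

random.seed(42)
for _ in range(200):
    xs = [random.randint(0, 9) for _ in range(random.randint(0, 20))]
    ys = dedupe(xs)
    assert len(ys) == len(set(xs))   # property: exactly one of each value
    assert all(x in xs for x in ys)  # property: nothing invented
print("properties hold")
```

This kind of test is exactly what agents are good at generating in bulk: the properties are cheap to state, and the machine does the case exploration.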

Requirements also need to shift left. IMHO, there is no such thing as one SDLC. Software development follows a two-phase SDLC: one for discovery and one for delivery. AI is democratizing discovery, not delivery. Now we can have designers and product folks working closely with architects and engineers for discovery, not for delivery. Maybe now we can realize that requirements were always lies ("requirements" is a terrible word: it assumes the thing is settled and just needs to be clarified); instead, a requirement is just an assumption, and we still need to discover and learn whether it is true or not. AI coding agents allow designers and product people to see a prototype of the software running very fast; they can use Claude Code to experiment and learn much faster now.

Monitoring can also shift left. Using AI, we can generate much more observability and make sure we expose many more metrics than before; agents can look at metrics before they go to production and help us understand systems. That can also be tested: we can test whether the system is adding observability, and if it is not, halt the deploy until the system gets fixed (maybe by agents).
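An observability gate could be as simple as the sketch below; the metric names and the idea of comparing against a required set are made up for illustration (a real version would scrape the service's metrics endpoint):

```python
# Sketch of an observability gate: before deploy, check that the service
# actually exposes the metrics we require. Metric names are illustrative.

REQUIRED_METRICS = {
    "http_requests_total",
    "http_request_duration_seconds",
    "errors_total",
}

def missing_metrics(exposed):
    """Return the required metrics the service does not expose."""
    return REQUIRED_METRICS - set(exposed)

# Stand-in for scraping the service's metrics endpoint:
exposed = ["http_requests_total", "errors_total"]
gaps = missing_metrics(exposed)
if gaps:
    print("halt deploy, missing:", sorted(gaps))
```

The same gate that halts the deploy can hand the gap list back to an agent to generate the missing instrumentation, closing the loop before production.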

Now the biggest shift of all is in engineering itself. We don't know yet who the winner is, or what "new workflow" we will end up using. 2026 is an interesting year for experimentation and learning. Let's keep hacking and keep learning.

cheers,

Diego Pacheco
