The Death of Code Review (Again)

Almost 3 years ago, in 2023, I wrote a blog post about the death of code review. It hasn't been that long, but software engineering has changed a lot in the last 2 years, and it will likely change even more in 2026. Back in 2023, I was not even talking about AI being the "killer" of code review; it was a series of things: people not paying attention, LGTM without any effort, and a lack of prioritization. So we need to understand that code review was always an important practice, but even without AI, it was already in decline, and people were doing it wrong. Now, in 2026, there are even stronger forces pushing code review to die or to change significantly. Most engineers dislike doing code review. I do like code reviews. I find code review useful and an important tool to enforce consistency. However, the disruption of code review is inevitable...

Why do we need Code Review?

Code review has many roles in modern software engineering, to name a few:

  • Quality: It's a peer review process (engineer A reviewing the code of engineer B), which allows an increase in quality (built-in quality, a Lean principle).
  • Consistency: Consistency is established by following a uniform architecture vision, common standards, and team agreements, and it can be a collective call or centralized in an architect.
  • Confidence: Code review helps increase our confidence that we can release software without breaking the system, service, app, or whatever we are building.
If code review is done well, it's very useful because it can uncover bugs, design flaws, testing gaps, observability gaps, missing logic, and missing business requirements. Good code review and bad code review are almost indistinguishable, because a good code review ends with the words "LGTM" and a bad code review also ends with "LGTM". So, how do you tell the difference between a good and a bad code review?

It's all about the review process. Tell me what the reviewers are doing and how much they are enforcing the right things, and I will tell you whether it is a good code review process or not. As I mentioned in my previous post from 2023, a good code reviewer would check for:
  • Architecture / design gaps
  • Missing functionality or business requirements gaps
  • Corner cases
  • Missing observability
  • Poor error handling
  • Poor coding practices and anti-patterns
  • Missing tests or not enough testing diversity
Now, code review depends on the team and on the person, so it can vary widely. When you have an ownership erosion issue (a post I wrote in 2021), people might not really care, leaving it to the committer to deal with any production bug. Team dynamics can change code review dynamics significantly.

Code Review is a process of alignment and vision shaping. I really like the practice of having the architect as the sole merger. Multiple people can review, but only one or two should be able to merge, which allows conceptual integrity to happen. If everybody can merge, it's hard to guarantee integrity: integrity of design, architecture, concept, and code structure.

If you do this process right and your team is good, you catch fewer and fewer things, because the team learns a common way and style of doing things. You can still catch issues, but the more aligned a team is, the fewer things they catch in a code review...

Code review also works less effectively because, unfortunately, people are working on their own branches.

Branches and Fake CI

Since 2009, I've been a strong advocate against branches. Branches are bad because they kill real continuous integration. I talked about that in another post from 2023: The Death of CI/CD. The logic is very simple: if all engineers are working on separate branches, then when you run a CI job in Jenkins, your code is not there, because it has not been merged into develop or main yet. As I mentioned in the 2023 post, release trains are also a very bad practice. Only when the release cycle happens does the code get fully merged and integrated - that's where all the problems appear.

Problems get swept under the rug and stay hidden until you try to release; then the release is problematic and never goes well. Why? Because of the branches and the lack of real CI. That effect makes code review weaker as well, because you are not reviewing the whole story, only small bits at a time. I wrote about that in a 2022 post, Beyond Code Deltas, where I explained that I always preferred to do code review out of the cycle, not tied to a pull request, but instead reading the whole code base so I can grasp the big picture.

AI Disruption Force: From Copilot to Coding Agents

Back to 2026. AI is disrupting software engineering like we have never seen before. When Copilot arrived, it was a great innovation, but it did not disrupt the software engineering process too much; it was just a better tool, a better autocomplete, that saved us from doing searches on Google or StackOverflow.

However, since the rise of coding agents like Claude Code, Codex, Gemini CLI, Copilot CLI, and many others, software engineering has started a much deeper disruption process, because engineers now spend much less time in traditional IDEs like IntelliJ and VSCode. In this post, De-Risking, I explained that Claude Code is the new IDE; it's where you spend all or most of your time now. Claude Code is not a traditional IDE or a code editor, but I'm pretty sure you get my point.

Coding agents significantly speed up the engineering process. For instance, Boris Cherny, the creator of Claude Code, is doing 50-100 PRs per week, which is a lot. He also shared that he is using Claude Code to build Claude Code. That's impressive, but we need to remember that engineering tools and infrastructure solutions don't require as much product/UX discovery as commercial or consumer software.

Claude Code is an engineering solution made by an engineer for engineers, using AI, of course, but it's not a baking app, it's not an e-commerce platform, it's not Netflix, it's not sausage factory management software, it's not a health care system. All the software I just mentioned is fundamentally different, because:

  • The consumers are not only engineers (yes, engineers watch Netflix too)
  • You must have a much stronger UX structure
  • You must have Business Analysts / Product Managers
  • You have one or more regulations, depending on the industry, like health care.
  • You have Legal and Public Relations concerns 
  • There is a real need for the involvement of many more people
All of that needs to be taken into account. It's impressive to see hundreds of PRs a week for Claude Code, but this does not translate 1-to-1 to all industries and all realities. Sure, we will see improvements, but it's not the same thing.

I will explore this further in a different blog post, but if we don't measure things end-to-end, it's pretty hard to tell whether we had real improvements; that's a Lean principle. If you develop 5x faster now but release software at the same speed as before, nothing really changes, and the benefits are not real.
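To make that concrete, here is a minimal sketch of measuring lead time end-to-end instead of coding speed alone; the data model and field names are hypothetical, purely to illustrate the idea:

```typescript
// A minimal, hypothetical sketch of measuring lead time end-to-end.
// None of these names come from a real tool; they only illustrate the idea.
interface WorkItem {
  id: string;
  requestedAt: Date;   // when the business asked for it
  firstCommitAt: Date; // when coding started
  mergedAt: Date;      // when the PR was merged
  releasedAt: Date;    // when it actually reached users
}

const hoursBetween = (from: Date, to: Date): number =>
  (to.getTime() - from.getTime()) / 3_600_000;

function summarize(item: WorkItem) {
  return {
    // Cycle/coding time: the part agents make 5x faster.
    codingTimeHours: hoursBetween(item.firstCommitAt, item.mergedAt),
    // Lead time: request to production, the end-to-end Lean measure.
    leadTimeHours: hoursBetween(item.requestedAt, item.releasedAt),
  };
}
```

If coding gets 5x faster but releasedAt still trails mergedAt by weeks, the lead time barely moves, and that is the number that matters.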

Now, Claude Code is not the only force; there are others, like Ralph Loops and Gas Town, which I will also be covering in other blog posts. The point of these new approaches is that, with multi-agent systems, we would go even faster. Like I said, "really faster" is a Lean question, more precisely about lead time, so cycle time alone does not impress me, but we will see :-)

Gas Town is bonkers. As the author says, it's the Kubernetes of multi-agent systems. The author claims to have maxed out 3 Claude Max subscriptions in a matter of days. So Gas Town and even Ralph Loops can significantly increase costs, and especially in Gas Town we see more factors that contribute to the death of code review. IMHO, Gas Town is not production-ready right now, but it's an interesting idea to keep an eye on. Gas Town even has an agent just to deal with merges.

If we can code at the speed of light with agents and multi-agent systems (even faster), what is the next bottleneck? Well, it's code review. There is another pressure point: teams are getting way more PRs than ever, so the pressure on the review queue is huge. So here is what people do, might do, and are already doing:
  • Don't pay much attention and just LGTM
  • Get more people to help in code reviews
  • Create or use a code review agent like Greptile, CodeRabbit, or GitHub Copilot Code Review
  • A code review agent could also be a sub-agent or a custom command, which is just a markdown file in Claude Code (a local folder on your machine).
  • Find other ways to increase quality and depend less on code review
  • Keep doing what we always do (but there will be a bottleneck)
IMHO, the reality of each team will be different; the more critical something is, the slower it will be. So we need to understand the nature of the team or the nature of the project. For instance, consider the criticality of the team/project (a small sketch of such a policy follows the list):
  • low: just AI
  • medium: AI and humans sometimes (maybe a sampling like 1/5 PRs)
  • high: AI coding Agents + Humans 
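Here is a minimal sketch of that kind of policy expressed as code; the tiers and the 1-in-5 sampling mirror the list above, and everything else (function and field names) is hypothetical:

```typescript
// Hypothetical sketch: routing PRs to a review depth based on criticality.
// The tiers and the 1-in-5 sampling mirror the list above; names are made up.
type Criticality = "low" | "medium" | "high";

interface ReviewPolicy {
  aiReview: boolean;
  humanReview: boolean;
}

function reviewPolicy(criticality: Criticality, prNumber: number): ReviewPolicy {
  switch (criticality) {
    case "low":
      return { aiReview: true, humanReview: false }; // just AI
    case "medium":
      // sample roughly 1 in 5 PRs for a human look
      return { aiReview: true, humanReview: prNumber % 5 === 0 };
    case "high":
      return { aiReview: true, humanReview: true }; // AI coding agents + humans
  }
}

console.log(reviewPolicy("medium", 20)); // { aiReview: true, humanReview: true }
```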
So why not never look at the code again? That would remove the ultimate bottleneck, right? Therefore, the true, ultimate death of code review, right? Why not?
  • Some projects cannot fail under any circumstances (critical business rules, for instance)
  • How can you tell the Architecture and the Design are right? (you need to review the code, maybe not every delta, but 1x per month?)
  • Security (we know LLMs suck at security, and we can't ignore that, so for security reasons we need to look at what the code is doing - it could be a scanner or an agent helping, but still, we would need to read it)
See, it turns out the code wasn't everything; there is more in the code than some people thought... However, we could still have stronger guardrails and compensating controls, which would compensate for less code review... which leads us back to guardrails.

More powerful Guardrails

Code review is a manual process. Everything that is manual is error-prone. Software engineering is all about reliability and consistency. LLMs are not reliable, because they are slot machines. However, engineering is reliable. So we can add reliable guardrails that serve as compensating controls for less code review. For instance, consider:

  • Increasing Testing Coverage
  • Increasing Testing Diversity (unit tests, integration tests, chaos testing, stress testing, etc.)
  • Having more comprehensive linters, in the case of TypeScript
  • Leveraging strongly typed languages like Scala 3 and Rust (see the sketch after this list)
  • Having better observability in the code
  • Leveraging containers, K8s, progressive rollout patterns, and split traffic
  • Beta user programs
  • Code review outside of the delta (PR cycles) - maybe 1x per month?
  • Leveraging policy as code and having more automated checks in the infrastructure: Terraform, K8s, AWS resources, and everywhere else you can use code to enforce policies.
  • Real CI/CD with small deltas and constant deploys (not constant releases)
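As one small illustration of the strong-typing point above, here is a TypeScript sketch (the domain and names are hypothetical) where the compiler, rather than a reviewer, catches a missing case:

```typescript
// Hypothetical example of a type-level guardrail: adding a new status
// without handling it becomes a compile error, not a production surprise.
type PaymentStatus = "authorized" | "captured" | "refunded";

function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${String(value)}`);
}

function canShip(status: PaymentStatus): boolean {
  switch (status) {
    case "authorized":
      return false; // money reserved, not collected yet
    case "captured":
      return true;  // paid, safe to ship
    case "refunded":
      return false;
    default:
      // If someone adds "chargeback" to PaymentStatus and forgets this
      // switch, the compiler rejects the build right here.
      return assertNever(status);
  }
}
```

The rule lives in the type system, so it holds whether the code was written by a human or by an agent, and no reviewer attention is needed to enforce it.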
IMHO, these guardrails are powerful and very reliable; therefore, they would allow us to review fewer deltas on some projects and still move fast with confidence.

Signals

A good system has signals that can tell us what's going on. It's important to have lead time metrics, but we also need other signals; that's how we tell whether things are okay or not. Here are some examples of signals:

  • Number of incidents in production
  • Number of bugs in production
  • Number of support calls
  • Number of bad reviews/comments in the Apple and Google app stores
  • Site Traffic
  • Revenue 
Such signals can also be called metrics or observability. Having overall signals is great for the company's overall status; however, we also need feature observability. If we have metrics for all the features that we release, we can know what's going on. Observability is another compensating control; it does not prevent bad experiences for users, but it saves future users from them. Observability combined with split traffic and rolling update patterns allows us to reduce the blast radius, and that keeps a bad user experience from reaching all users (only a few would suffer).
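As a minimal sketch of feature observability, assuming the OpenTelemetry JavaScript API (@opentelemetry/api) with a meter provider configured elsewhere; the feature, metric, and cohort names are illustrative only:

```typescript
// A sketch of per-feature signals, assuming the OpenTelemetry JS API
// (@opentelemetry/api) and a meter provider configured elsewhere.
// The feature, metric, and cohort names are illustrative only.
import { metrics } from "@opentelemetry/api";

const meter = metrics.getMeter("checkout-service");
const checkoutErrors = meter.createCounter("checkout.errors", {
  description: "Errors in the new checkout feature, per rollout cohort",
});

export function recordCheckoutError(cohort: "beta" | "stable"): void {
  // Tagging by cohort is what lets split traffic + progressive rollouts
  // limit the blast radius: a spike in "beta" rolls back before "stable".
  checkoutErrors.add(1, { cohort });
}
```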

How to make it better?

It's hard to say how software engineering will look in 2, 5, or even 10 years, but here are some things we can do that will help:

  • Add more guardrails
    • Increasing Testing Coverage 
    • Increasing Testing Diversity (unit tests, integration tests, chaos testing, stress testing, etc.)
    • Having more comprehensive linters, in the case of TypeScript
    • Leveraging strongly typed languages like Scala 3 and Rust.
    • Having better observability in the code
    • Leveraging containers, K8s, progressive rollout patterns, and split traffic
    • Beta user programs
    • Code review outside of the delta (PR cycles) - maybe 1x per month?
    • Leveraging policy as code and having more automated checks in the infrastructure: Terraform, K8s, AWS resources, and everywhere else you can use code to enforce policies.
    • Real CI/CD with small deltas and constant deploys (not constant releases)
  • Consider critically whether to go with more or fewer reviews:
    • low: just AI
    • medium: AI and humans sometimes (maybe a sampling like 1/5 PRs)
    • high: AI coding Agents + Humans 
  • Consider doing code reviews outside of PR cycles (like 1x per month)
  • Add proper observability with the right signals, like:
    • Number of incidents in production
    • Number of bugs in production
    • Number of support calls
    • Number of bad reviews/comments in the Apple and Google app stores
    • Site Traffic
    • Revenue 
  • Evaluate code review agents like Greptile, CodeRabbit, or GitHub Copilot Code Review, but still review outside of the PR cycles.
  • Understand that if engineering can produce code faster via agents, we can also fix problems faster with the same agents; bugs or bad behavior would not take long to be noticed, considering proper testing. 
We are living through and experiencing the disruption AI is causing in the software engineering process and practices. Things will change; keep an open mind, working with a lab mindset that allows you to experiment and learn rather than making final decisions that cannot be undone. Always make sure you can undo what you are doing...

Cheers,

Diego Pacheco
