Code Analysis at Scale

Have you ever thought about how to evaluate software, outside of the context of Build vs Buy? It sounds like a crazy question, right? Well, the reality is that at scale, you cannot just refactor everything for the sake of refactoring. Rewrites, refactorings, and re-architectures need to be opportunistic and deliver real value to the business. So if it's pretty hard to refactor everything, how do you approach software at scale? Some rationalization of the state of affairs is important. Rationalization leads to proper prioritization, which, paired with good execution, can be a powerful tool to deliver change and improve quality. At first, the question might sound a bit overwhelming and difficult, but when you start diving deep, you realize what makes sense and what doesn't, and some shapes emerge. So before I go too deep into this matter, let's start with a simple question - why evaluate software?

Why Evaluate Software?

Besides the context of Build vs Buy, there are plenty of reasons to evaluate software, such as:

  • Understanding - How do things work? What does not work? How could it be better?
  • Prioritization - Should we improve Service A or Service B first? 
  • Decisioning - Should we decommission Service A or Fix it? What's ok? What is not?
  • Strategy - Should we do it? Should we delegate to a company? Should we start a new code base? Should we merge/split code bases? 

Evaluation is important because it brings clarity to the table and gives you data on what's going on, which is useful on multiple fronts, such as:

  • Evangelize your team to make some change
  • Evangelize the rest of the company for a bigger change, often in the form of a project or initiative
  • Avoid efforts on software in maintenance mode (because the software will be decommissioned or replaced by some other tool), e.g., PRs for new "Features".
  • Avoid more problems by accelerating some key initiatives, e.g., decommissioning some monolith.

How to evaluate software?

IMHO the question you need to answer is - can we live with this piece of software? This does not mean the piece of software is debt-free, bug-free, or perfect, but it means it's manageable and reasonable. Now, what counts as reasonable might change from company to company.

People are pretty good at getting used to problems and working around them. This is a good thing, don't get me wrong, but there is a point where this maneuver becomes too expensive and only a few people can do it - I believe this is a good sign that we might need help with this particular piece of software.

At scale, there is always a problem, there is always software that needs to be rewritten, and there is always technical debt. Successful companies have lots of technical debt, and that is fine. That does not mean we need to stop fighting - on the contrary, we need to keep fighting complexity and managing it down. A good way to start is to evaluate what you have in your hands. Evaluation is important because it is one of the main tools to fight complexity, alongside talent density and the right engineering culture.

In order to evaluate software, we need to look at the code; there is no way to evaluate by just talking to people; interviews are interesting and useful but limited to the current status quo and culture. Ok. Read the code - but look for what? Let's consider 3 examples.

Example 1 - Library Migration

Let's say you are planning to roll out a new version of an internal shared library that the whole company uses, and this library is critical; you made some changes to the public APIs and broke backward compatibility. With this scenario in mind, you could look for the following elements in your consumers' code bases:

  • How many consumers are we breaking?
  • What are the use cases? How do they use your solution?
  • Do we see any anti-patterns or code abuses? 
  • In the case of breaking changes - can we provide a recipe (steps) to perform the migration or evolve the code for the new version?

Established frameworks often have changelogs and migration guides. For some strange reason, internal shared libraries often do not have that. Reading the code of your consumers can give you the data necessary to estimate the effort of your migration and to set the right expectations and callouts up front.
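To make the first question in the list above concrete, here is a minimal sketch of how you could count usages of broken APIs across consumer code. The API names (`LegacyClient.connect`, `parse_v1`), the consumer names, and the in-memory `sources` structure are all hypothetical; in practice you would load file contents from cloned repositories or from your code search tool.

```python
import re
from collections import defaultdict

# Hypothetical names of public APIs that changed or were removed in the new version.
BROKEN_APIS = ["LegacyClient.connect", "parse_v1"]


def find_broken_usages(sources, broken_apis):
    """Map each consumer to the broken APIs it references and how often.

    `sources` maps a consumer name to a dict of {file path: file content}.
    Consumers with no hits are left out of the result.
    """
    patterns = {
        api: re.compile(r"\b" + re.escape(api) + r"\b") for api in broken_apis
    }
    report = defaultdict(lambda: defaultdict(int))
    for consumer, files in sources.items():
        for content in files.values():
            for api, pattern in patterns.items():
                hits = len(pattern.findall(content))
                if hits:
                    report[consumer][api] += hits
    return {consumer: dict(apis) for consumer, apis in report.items()}


# Hypothetical consumers, inlined here so the sketch runs as-is.
sources = {
    "billing-service": {
        "src/app.py": "client = LegacyClient.connect(url)\nparse_v1(a)\nparse_v1(b)",
    },
    "search-service": {"src/main.py": "result = parse_v2(data)"},
}

for consumer, apis in find_broken_usages(sources, BROKEN_APIS).items():
    print(consumer, apis)
```

This is plain text matching, so it will miss aliased imports and dynamic calls; treat the counts as a starting point for the inventory, not as a precise answer.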

Example 2 - Fixing a Classical Monolith and/or Distributed-Monolith 

Now let's say you need to build a new feature, which will require creating new tables and performing a bunch of changes in your current code base. It's the perfect opportunity to do it in a different way: maybe a new service, a new database, or even a refactoring of the existing one. Now the question you need to ask yourself is - is this the right thing? Are we making things better or worse? Again, to answer this question, we need to do some code evaluation, looking for the following:

  • How much isolation does the code have? Are the new tables isolated from others?
  • Does the code have a service interface in front of the database? Is the DB shared?
  • What are the upstream and downstream service dependencies?
  • Do we have a clear, simple, and explicit contract for the service? 
  • What about internal shared libs - any abuses or heavy libraries being used? 
  • What does the test coverage look like? Do any tests cover the public contracts?

New services can be a good thing, but do not assume they are; you could be just making a distributed monolith worse and bigger - so evaluate before moving.
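The database isolation questions above can be checked with a small sketch like the following. It assumes you have already collected, by reading the code, which tables each service touches; the service and table names are hypothetical.

```python
from collections import defaultdict


def find_shared_tables(table_usage):
    """Return the tables referenced by more than one service.

    `table_usage` maps a service name to the set of table names it reads or writes.
    The result maps each shared table to the sorted list of services using it.
    """
    users = defaultdict(set)
    for service, tables in table_usage.items():
        for table in tables:
            users[table].add(service)
    return {
        table: sorted(services)
        for table, services in users.items()
        if len(services) > 1
    }


# Hypothetical findings gathered from reading the code of three services.
table_usage = {
    "orders": {"orders", "customers"},
    "billing": {"invoices", "customers"},
    "shipping": {"shipments"},
}

print(find_shared_tables(table_usage))
```

Any table that shows up in the result is a place where a new service would join an existing distributed monolith rather than being isolated.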

Example 3 - Business Re-Org

Let's say your business is changing direction from one domain to another. It's common for businesses to pivot the business model, acquire companies, sell companies, and have multiple lines of business rising and sunsetting all the time. All of this implies ownership changes - who will be the owner of what? You might start managing a bunch of software you did not manage before. Doing some code evaluation is a good idea, either to better prioritize the future actions of your team or to decide which teams should own which pieces of software. In that case, we could look at:

  • What is the business frequency of change?
  • What does the test coverage look like? Do any tests cover the public contracts?
  • Do we have isolation on the contracts and databases of the services?
  • How much technical debt do we have? What are the worst offenders?

Evaluations are important because, in the face of opportunities, you will be able to make the most of them. Otherwise, you might just be doing more of the same, going with the flow and losing precious opportunities to improve the state of affairs.
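The frequency of change asked about in the list above can be approximated from version control history. The sketch below is one possible approach, not the only one: it counts file changes per top-level directory from the output of `git log --name-only --pretty=format:`, assuming each top-level directory roughly maps to a module or service. The repository path and directory names are hypothetical.

```python
import subprocess
from collections import Counter


def read_git_log(repo_path):
    """Return changed file paths, one per line, for every commit in the repo."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def churn_by_top_level_dir(log_text):
    """Count file changes (not commits) per top-level directory."""
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if not path:
            continue  # blank lines separate commits
        counts[path.split("/", 1)[0]] += 1
    return counts


# Hypothetical log output; with a real clone, use read_git_log("/path/to/repo").
sample_log = "billing/api.py\nbilling/db.py\n\nsearch/index.py\n\nbilling/api.py\n"
print(churn_by_top_level_dir(sample_log).most_common())
```

High churn alone does not mean a problem; it becomes interesting when combined with the other questions, such as weak test coverage or a shared database.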

Smells, Risks, and Traits

Code and design smells can help you. They do not mean 100% trouble, and they have different levels of criticality, e.g., a poorly written internal DSL is less critical than a lack of database isolation (sharing a database between 3 services from different business domains).

Risks need to be looked at from the perspective of trends, meaning whether things will get better or worse over time. Traits are common behaviors you can expect from software and teams, given some state in time. For instance:

  • Classical Monoliths tend to be full of technical debt - therefore forcing microservices to happen.
  • An explosion of microservices done wrong tends to end up sharing databases, creating distributed monoliths.
  • Distributed monoliths tend to get worse over time, being hard to test and hard to evolve, which forces people to think about monoliths again (full circle).

The Value of Inventory

Inventories are important and great for summarizing findings, no matter if you are doing an internal library migration, fixing monoliths, or going through a team/product re-org. An inventory can be done with a simple sheet or wiki page. Building inventories takes a long time, but it pays off.
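As a minimal sketch of what such a sheet could contain, the snippet below writes inventory entries as CSV, which you can paste into a spreadsheet or wiki. The columns and the sample rows are hypothetical; pick the fields that match the questions you are trying to answer.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class InventoryEntry:
    # Hypothetical columns; adapt them to your own evaluation questions.
    service: str
    owner: str
    shared_database: bool
    contract_tests: bool
    notes: str


def to_csv(entries):
    """Render inventory entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(InventoryEntry)]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


entries = [
    InventoryEntry("orders", "team-a", True, False, "shares DB with billing"),
    InventoryEntry("shipping", "team-b", False, True, "isolated"),
]
print(to_csv(entries))
```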

How do you know you are fighting the right battles? 

Unfortunately, at scale, not all problems can be fixed at once; prioritization is required to be practical and to get results. This does not mean giving up or looking the other way and ignoring problems, but it means having quick wins and building momentum in order to make more improvements.

Going Forward

Always perform evaluations. Knowing what is happening in the code is critical to making good calls and understanding complexity. Anyone can perform code analysis; pretty much all companies today have GitHub or GitLab, public or internal, with a web-based interface and some basic search capability. Downloading the source code and reading it in your IDE is also a good idea.

Reading code is much harder than writing code. Our industry believes junior engineers write code and senior engineers delete code. But IMHO, senior engineers build inventories and analyze code instead of just deleting it without analysis.

Cheers,

Diego Pacheco

