Kanban for DevOps Engineering: From Sense to Predictability

Kanban is a Lean tool. It's focused on the psychological part of the work. In a nutshell, you manage the system via constraints and you don't manage people. I have used Kanban a lot over the last 8 years. When I started, I was a Software Architect and Coach in not-so-traditional SOA projects.

Today I work a lot with DevOps Engineering and Cloud Computing, and I don't work with 2-week sprints but with quarters, which are 3-month buffers of work. You might think of a quarter as a super big sprint. There are several different classes of work, like (A) Support for Developers, (B) Tuning, (C) Stress Testing, (D) NoSQL Development, (E) Telemetry and Observability, (F) Distributed Tracing, and so on. I'm working on a CORE team with another architect and DevOps Engineers.

For a long time, I used Kanban metrics and sheets pretty religiously, because I was working with big teams (up to 33 people) and multiple teams who had to deliver on time. The closer you are to the business, the more critical meeting dates is. However, even in a CORE team DATES are important and sometimes you really can't miss deliveries - which is kind of my moment right now. So what do I do? Well, I went back to the roots and added more Kanban to my team practices.

Basic Laws of Physics

When you are working on a CORE / Platform team, things are a little bit harder since you have fewer people and more CRITICAL and COMPLEX work to deliver. There are things you don't know will work before you do some POC. In order to work with complex distributed systems, the scientific method and scientific thinking are really important, especially for bugs.

So a Big Sprint (3 months) does not mean you deliver software only every 3 months; however, you are assuming a bigger BATCH than regular microservices. This might sound inefficient and wasteful, but there are practical reasons to work with a bigger BATCH, mostly because we are talking about DATA and INFRASTRUCTURE, which are things that go SLOWER than microservices. Microservices have a very fast pace. The Data Layer can never go as FAST as microservices because of all the complexities and STATEFULNESS involved.

Also, when you are working with microservices it is easy to throw away a release, do an automated rollback and switch back to the previous version, also because you might be, or rather should be, doing automated canary analysis. However, even on the DATA side of the force, we work with automation and automated rollbacks / FIXES, which is also called the Remediation process (it will be covered in future posts - stay tuned). Still, it is slower than microservices.

You might be thinking about LEAN, and when I say BIG BATCH it might sound like WASTE, since Lean is all about reducing batch size. Well, I don't mean that :-) Big batch is from the planning point of view; from the delivery point of view, day-by-day work is all about small batches. PHEW, now we are back to Lean, baby.

Why don't you CUT the quarters then and end up planning small things? Well, it depends: for some things it is possible, for others it is not. If you think of another Lean Startup principle called MVP (Minimum Viable Product): OK, I can deliver a smaller batch, but would that batch be usable or add value somehow? That's the big question. Also, there are other complications: a RELEASE for DATA needs to be perfect, and BUGS are way more DANGEROUS than in microservices, since we are talking about affecting the Availability, Reliability and Consistency of applications, so you really can't mess with this. So stability and stability principles are very important. Breaking things for people is really not an option at all.

Multiple Customers and Different Requirements

Another complication that happens when you are working with Data is that you have multiple customers. Microservices teams often have a single PO, or multiple POs who do some rotation, so often you just have 1 PO per sprint or per service. However, in my case, when we are talking about DATA and INFRASTRUCTURE, we can have multiple customers, often developers in microservices teams.

This makes priorities more complex and harder to align. Also, one customer might benefit a lot from a feature while other customers don't use it at all. So this is really a more complex place to be. However, you tend to reuse more features between customers and often have more time to deal with some kinds of problems. So, like anything in life, it's all about trade-offs.

There are people who love to work with DevOps Engineering and Architecture; other people might like to be on the CORE BUSINESS and work more with microservices. I like both; however, today I PREFER the DATA part since the challenges are more interesting to me.

Starting with Sense

So if we get back to Management 101, the basic question we need to answer is... Are we doing well or not? Well, this is a bit more complicated since the variation is often bigger. Tom DeMarco, in the book Peopleware (it's a classic), already explained that very well. The more variation you have, the harder it will be to be predictable. So if you want to be predictable you (ideally) need to have less variation; however, variation is very hard to control in real life.

There are other ways to deal with variation, such as using Scrum Planning Poker points to realize if things are too big or too small and trying to remove variation on average over time. Another way to deal with it is working with a constant flow of delivery (reducing batch sizes) so that, on average, items have the same size.

Kanban has a pretty simple and straightforward way to deal with predictability. Before I continue, you need to understand that there is no magic: if priorities keep changing, new BUGs (lack of stability) keep piling up, or there are lots of support needs (lack of better abstraction and documentation, or the smell of a complex solution), you will be dragged away from your coding goals and you won't deliver.

So in order to be focused you need to make sure you have proper documentation, effective and easy self-service solutions, and STABILITY, so you have fewer bugs and can focus on more features. Easy to say, not so easy to earn.

Going back to the simple Kanban predictability method: we need to check things often and know how much we did and how much is pending. This is one of the main reasons why I use the word sense - because it is a "sense", not something written in stone. It's pretty efficient, I need to say; however, it's not 100% precise.

Kanban Predictability

Basically, you count how many items you need to deliver, let's say 30 items in a quarter. Then you count how many weeks you have left; let's say we are in the middle of the quarter and you have 10 weeks. Now we can do some simple math:

Number of Items to be Delivered / Remaining Weeks

So for our simulation it will be: 30 / 10 = 3. So the team needs to deliver 3 items per week. If you check this every week, you will have good confidence about whether you will make it or not, so you can negotiate scope, try to get more time, or even reduce items. Let's say you don't have an option and need to deliver no matter what (which often happens): then you will need to work more hours.
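The weekly check can be sketched in a few lines of Python. This is only an illustration of the division above, not a tool from the original workflow: the function names, the `recent_throughput` input (the team's average items delivered per week) and the "items to renegotiate" figure are all assumptions made for the example.

```python
import math


def required_weekly_rate(items_remaining: int, weeks_remaining: int) -> float:
    """Items per week the team must deliver to finish on time."""
    if weeks_remaining <= 0:
        raise ValueError("weeks_remaining must be positive")
    return items_remaining / weeks_remaining


def weekly_check(items_remaining: int, weeks_remaining: int,
                 recent_throughput: float) -> dict:
    """Compare the required rate with the pace the team has actually had.

    recent_throughput is a hypothetical input: the average number of
    items delivered per week so far.
    """
    required = required_weekly_rate(items_remaining, weeks_remaining)
    # Items we expect to finish if the recent pace holds.
    projected = recent_throughput * weeks_remaining
    # Items that would not fit and need a scope or date conversation.
    shortfall = max(0, math.ceil(items_remaining - projected))
    return {
        "required_per_week": required,
        "on_track": recent_throughput >= required,
        "items_to_renegotiate": shortfall,
    }


# The example from the text: 30 items, 10 weeks left.
print(required_weekly_rate(30, 10))  # 3.0

# A later check with made-up numbers: 24 items left, 7 weeks left,
# and the team has averaged 3 items per week.
print(weekly_check(24, 7, 3.0))
```

Running the check every week with fresh numbers gives the same "sense" described above: when the required rate climbs above the recent pace, it is time to negotiate scope or dates.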

This can be tracked with a very simple Google Sheet. David Anderson predicted and showed how to do this a long time ago. However, 6 years ago most people were not doing Microservices, DevOps Engineering and Core Platform teams. Regardless of what you are doing, some of the principles don't change and are still useful and work.

Diego Pacheco
