Central and Unique Platform teams
Multiple Teams to the Rescue
A single Platform / DevOps team can easily become a bottleneck. How can we fix that? Multiple teams are the answer. Having multiple teams brings several advantages, such as:
* Reduce the Communications Blast Radius (Team Topologies): you partition your communications.
* Purpose Built Solutions and better services via specialized teams
* More parallelism but with a hidden orchestration cost
* Easier to experiment and move fast
Communication is complicated with a single centralized DevOps/Platform team since all tickets literally go to the same queue. When you have limited resources and a huge queue, you end up doing the following:
* Standardize on general solutions (e.g., EC2 for all the things).
* You say NO a lot since you have neither the skills nor the time to handle the demand.
* You might block innovation and become a bottleneck for other teams.
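To put rough numbers on the single-queue problem, here is a back-of-the-envelope sketch. It assumes an idealized case where tickets are independent and can be split evenly across teams, and it deliberately ignores any coordination overhead. The function name and the figures are made up for illustration.

```python
import math


def weeks_to_drain(tickets: int, teams: int, tickets_per_team_per_week: int) -> int:
    """Weeks needed to clear a backlog when work is split evenly across teams.

    Idealized assumption: tickets are independent and there is no
    coordination overhead between teams.
    """
    if teams < 1 or tickets_per_team_per_week < 1:
        raise ValueError("teams and throughput must be positive")
    return math.ceil(tickets / (teams * tickets_per_team_per_week))


# Hypothetical backlog of 120 tickets, each team closing 10 per week.
single_queue = weeks_to_drain(tickets=120, teams=1, tickets_per_team_per_week=10)
three_teams = weeks_to_drain(tickets=120, teams=3, tickets_per_team_per_week=10)
print(single_queue, three_teams)  # 12 4
```

Real backlogs will not divide this cleanly, so treat the result as an upper bound on the benefit rather than a forecast.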
Purpose-built solutions can be achieved by simply having more teams, which allows you to rapidly experiment with new use cases and debottleneck your system. Purpose-built solutions are better handled by different, smaller platform teams since you partition your backlog and can also rely on contractors to have a bigger pool temporarily.
More Parallelism: As you have more teams, you are able to execute more projects at the same time. However, this creates a prioritization / orchestration issue: which teams should you have? What happens if one team depends on another team? The solution could lie in other principles such as:
* Self-Service Model: If you have services (UIs, Jenkins jobs, an Inner Sourcing model) where anyone can open PRs or use a Jenkins job to create what they need, this debottlenecks lots of the operations and reduces tickets (manual work).
* Isolation and Autonomy: Another thing you can do is allow teams to make their own decisions, in the fashion of You Build It, You Run It (Amazon/Netflix philosophy). This means you don't depend on other teams because you take care of all aspects of the solution for a particular domain problem.
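To make the "one team depends on another" question concrete, here is a minimal sketch that groups teams into delivery waves based on who is waiting on whom. The team names and the dependency map are hypothetical. The only real API used is `graphlib.TopologicalSorter` from the Python standard library (3.9+). The more waves you get, the higher the hidden orchestration cost.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical teams: each key depends on the teams listed in its set.
depends_on = {
    "network": set(),
    "ci-cd": {"network"},
    "data-platform": {"network"},
    "app-runtime": {"ci-cd", "data-platform"},
}


def delivery_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group teams into waves; teams in the same wave can work in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if teams depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # teams with no unfinished dependencies
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = delivery_waves(depends_on)
for number, teams in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(teams)}")
```

A team with no dependencies at all, which is what Isolation and Autonomy aims for, always lands in the first wave. A `CycleError` is a signal that two teams block each other and the boundaries need rethinking.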
Experimentation is a nice outcome of this whole approach. Most companies don't understand the cloud and don't realize that cloud == software and that we have a huge engineering pool. The issue is that engineers, and sometimes even engineering managers, are sensitive about their roadmaps and deadlines and end up being afraid of taking the step forward, but it's not as hard as it looks. Experimentation means different things such as:
* Experiment with embracing the DevOps model and having engineers code in Terraform.
* Experiment with new delivery models such as more teams, Self-Service, and Isolation.
* Experiment with new technologies and new components, and make things move faster.
Experimentation has limits
You can experiment and easily introduce solutions for new services or new applications. However, the heavy lifting is often tackling technical debt and migrating old applications. The inertia in old systems at scale is a hard problem to fix. Experimentation will not save you from that unless you find ways to have backward-compatible APIs/data solutions.
Unfortunately, not all solutions should be backward compatible, since you could end up being backward compatible with technical debt and design limitations. Sometimes what you need is a big breaking change. The hard part is that, depending on how big your solutions are, you might not be able to apply that logic to all your software, so prioritization will be needed.
Engineering is hard and will always be hard, since you have new systems, old systems, technology changes, people coming and going, deadlines, security, and many, many other challenges. Team structure and layout are critical to whether your evolution goes better or worse.
Cheers,
Diego Pacheco