Posts

Showing posts from February, 2026

Agent Skill in Multi-Agent Systems

Most people building agents today work one-shot: they write the agent once and that's it. Yesterday, I was watching the YC Lightcone podcast "Inside Claude Code With Its Creator Boris Cherny", and one of the things Boris, creator and head of Claude Code at Anthropic, said is that they delete the CLAUDE.md a lot because they want the new models to take over. That insight tells us that we cannot just settle for whatever prompts we have. Besides that, depending on how we write the prompt, we might use more or fewer tokens; there are ways to better structure agents, workflows, and skills. In this blog post, I will cover some lessons learned while building and improving agents, workflows, and skills. I ran a bunch of experiments; in fact, I wrote 7 incarnations of my agent skill. To test the agent's skill, I asked the agent to build a Twitter-like application so I could evaluate the quality of the code and solution as a proxy for the agent's s...

AI coding Agents Evolution

AI coding agents like Claude Code, OpenAI Codex, and Gemini CLI have disrupted how software engineering is done. IMHO, the most disruptive agents are Claude Code and Codex. A lot has already happened, though, and the space keeps evolving. We saw the birth of custom agents and subagents to avoid passing the whole context window down, and custom commands to have more control over a workflow or over when a specific task is executed. Hooks add more determinism and make sure tests and linters are executed as part of the guardrails. We went from the explosion of MCPs to Multi-Agent Systems. Many interesting changes have happened; we learned some things, while other things are still to be learned. In this blog post, I will cover some of the evolution in AI coding agents (mainly around Claude Code). I did a lot of POCs with agents, 74 agent-related POCs at the moment. One thing I keep saying is that POCs are getting expensive, now not ...

AI Agent Infrastructure

One does not simply use AI agents in production. Before using AI agents in production, we need to understand that LLMs are token prediction machines and are non-deterministic by nature. No matter how good your specs are, AI will drop packages and make mistakes. Lack of determinism is just one aspect we need to keep in mind. We also need to keep in mind that it's very easy to jailbreak the models. Exposing a chatbot directly to customers is dangerous, not only in a security sense but also because of misuse and potential legal problems. Even if all of that is somehow managed and the risk is minimized with proper guarantees, one still does not just use agents in production. 15 to 20 years ago, we would not just deploy APIs to production; we would use an API Gateway. For agents and LLMs, we need the same thing: an AI gateway infrastructure. What happens if your API provider (Anthropic, Google, or OpenAI, for instance) is down? Is your business down?
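To make the provider-outage question concrete, here is a minimal sketch of one thing an AI gateway might do: try providers in order and fall back when one fails. Everything in it is an assumption for illustration. `complete_with_fallback`, `AllProvidersFailed`, and the two stub providers are hypothetical names, and the stubs stand in for real SDK calls, which are not shown here.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def primary_down(prompt: str) -> str:
    raise ConnectionError("primary provider unavailable")


def secondary_ok(prompt: str) -> str:
    return f"echo: {prompt}"


used, answer = complete_with_fallback(
    "hello",
    [("primary", primary_down), ("secondary", secondary_ok)],
)
```

In this sketch the caller never learns that the first provider was down; it only sees which provider answered. A real gateway would likely add timeouts, retries, and logging on top of this.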

State Induction

Imagine you are coaching a basketball team. You want to train your team to be good at 2-point shooting from the inside. Now imagine that, for some weird reason, you can't practice that directly, and you need to play a whole four-quarter basketball game in order to maybe, with a lot of luck, score 2 points. That would suck, right? It sounds insane, because we all know we can skip the whole game and just train 2-point shots from the inside, right? Well, what if I told you that the basketball game is often how people test software: they cannot train exact scenarios (State Induction), so they need to test the whole thing (expensive E2E testing). What if we could write tests in a very different way, one that allows massive parallelism, so that multiple people could test the same thing at the same time and it would still work?

AI Transformations

From time to time, the industry has a breakthrough, and things change. Sometimes the improvement is incremental, and other times it is very disruptive. Not every change lasts forever; in fact, new technologies tend to die sooner than old inventions like dishes, forks, knives, spoons, glasses, and many more. I remember web services dominating the corporate universe until REST came along and drove them almost to extinction. I remember EJB rising and falling in a flash, as did Netscape and many others. Those transformations are not new, and they do happen from time to time. Before AI, we had other movements and breakthroughs like DevOps, Cloud Computing, Agile, Mobile Phones, the Internet, and the Personal Computer. AI is perhaps the most disruptive force we have seen so far. No other technology or movement has had as much mystique as AI does. Some call it the Genie; others, the Revolution of the machines (Skynet); others think it's AGI already. One interesting effect we see at this point is that AI b...