Posts

Showing posts with the label Observability

Tagging Everything

I have always been a big believer in metadata and observability. Tagging is another form of observability; the idea is very basic and yet not well explored in our industry. You add metadata to a resource. Metadata is just data that describes data. Why bother? Well, at scale, there will be hundreds to thousands of resources, including EC2 machines, container images, security groups, load balancers, and all kinds of applications like services, BFFs, aggregators, and much more. How do you make sense of these resources? How do you know if you need them after all? Cloud computing is great, but it is also a big cost center. Understanding your resources is critical, not only for savings but for better infrastructure management. Tags help with cost, but they go beyond cost.

Endless Resources

So let's say you have some scale: you can easily have hundreds to thousands of EC2 instances and dozens to hundreds of Lambdas. The first question that should come to mind is ownership: who owns all these resources? By not ha...
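To make the ownership question concrete, here is a minimal sketch of a tag audit in plain Java. It is not tied to any cloud SDK: the resource IDs, the tag keys (`owner`, `cost-center`, `environment`), and the `TagAudit` class are all hypothetical names chosen for illustration. It only shows the idea of treating tags as metadata you can query to find resources nobody has claimed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TagAudit {
    // Hypothetical tagging policy: every resource must carry these keys.
    static final List<String> REQUIRED_TAGS = List.of("owner", "cost-center", "environment");

    // Returns, per resource id, the required tag keys that are absent or empty.
    // Resources that satisfy the policy are left out of the report.
    static Map<String, List<String>> missingTags(Map<String, Map<String, String>> resources) {
        Map<String, List<String>> report = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> entry : resources.entrySet()) {
            List<String> missing = new ArrayList<>();
            for (String key : REQUIRED_TAGS) {
                String value = entry.getValue().get(key);
                if (value == null || value.trim().isEmpty()) {
                    missing.add(key);
                }
            }
            if (!missing.isEmpty()) {
                report.put(entry.getKey(), missing);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up inventory: one fully tagged instance, one function with no owner.
        Map<String, Map<String, String>> resources = new LinkedHashMap<>();
        resources.put("i-0abc", Map.of("owner", "team-payments", "cost-center", "cc-42", "environment", "prod"));
        resources.put("lambda-resize", Map.of("environment", "dev"));

        System.out.println(missingTags(resources)); // prints {lambda-resize=[owner, cost-center]}
    }
}
```

In a real setup the inventory would come from the cloud provider's API rather than a hardcoded map; the audit logic stays the same.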

Health Checker: Java Embedded Server

Modern cloud services need to have health checkers. It might even make sense for some shared libraries to have health checkers. There is a lot of debate in the community about how a health checker should be implemented: whether it should just return a hardcoded dummy 200, or whether it should check all of the service's dependencies, such as database connections and other essential downstream dependencies, which would require a background thread in order to do it in a sane way. IMHO both can work. If you have observability into the other components, the service's health check does not need to cover all of that; if you don't have it, or don't have the control to add it, it might be a good idea to have it in the service. Besides health checkers, I believe we need greater observability in services, and even in some shared libraries, and we should collect metrics and expose config values and other key aspects. If you already have a service, let's say running Spring Boot (Tomcat or Netty) or Quarkus, you already have a service. H...
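As a rough illustration of the second approach (checking dependencies from a background thread), here is a minimal sketch built on the JDK's embedded `com.sun.net.httpserver.HttpServer`. It is not the code from the post: the `HealthCheckServer` class, the `/health` path, port 8080, and the dependency check passed in as a `BooleanSupplier` are assumptions made for the example. The HTTP handler only reads a cached flag, so a slow database never blocks the health endpoint.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class HealthCheckServer {
    private final AtomicBoolean healthy = new AtomicBoolean(false);
    private final BooleanSupplier dependencyCheck; // hypothetical check, e.g. "can I reach the DB?"
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "health-checker");
                t.setDaemon(true); // do not keep the JVM alive just for health checks
                return t;
            });
    private HttpServer server;

    HealthCheckServer(BooleanSupplier dependencyCheck) {
        this.dependencyCheck = dependencyCheck;
    }

    // Runs the dependency check once and caches the result; a throwing check counts as unhealthy.
    void refresh() {
        boolean ok;
        try {
            ok = dependencyCheck.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false;
        }
        healthy.set(ok);
    }

    // 200 when the last check passed, 503 otherwise.
    int statusCode() {
        return healthy.get() ? 200 : 503;
    }

    // Starts the background refresh and the embedded server; returns the bound port.
    int start(int port, long periodSeconds) throws IOException {
        refresh();
        scheduler.scheduleAtFixedRate(this::refresh, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            int code = statusCode();
            byte[] body = (code == 200 ? "UP" : "DOWN").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(code, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server.getAddress().getPort();
    }

    void stop() {
        scheduler.shutdownNow();
        if (server != null) {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder check that always passes; replace with a real dependency probe.
        HealthCheckServer health = new HealthCheckServer(() -> true);
        int port = health.start(8080, 10);
        System.out.println("Health endpoint listening on http://localhost:" + port + "/health");
    }
}
```

The "dummy 200" variant is the same server with a check that always returns `true` and no meaningful refresh; which one fits depends on how much observability you already have into the downstream components.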

Spring Boot Groovy Console

Observability is a must-have nowadays in modern cloud services. There are multiple levels and options to provide observability into a service. Logging is the basic one, but if well used it can perform miracles; then come metrics, dashboards, and alerts. At the top of the chain, however, the true nirvana is to have a self-managed / self-healing system and, for troubleshooting, a query interface. If you let it sink in for a moment and think about it, you will realize we often had such power in the past with relational databases, with a SQL client for instance. However, with a Spring Boot stack using polyglot persistence, JPA, Reactive Programming, multiple property sources, and beans, it's not easy to get into the internals of the application and shared libs. So what's best? To have access to all properties and Spring beans at runtime and be able to execute CODE in a very fluent way. That's where Groovy comes into play. Inspired by Gabor Bata's work, today I will show how we can get that and more w...

Java Agents

Java Agents are an interesting capability of the JVM. Agents can be either static (loaded when the Java app starts, with a special flag) or dynamic (using Java's dynamic attach API, we can bind to a specific JVM PID at runtime). Agents can be used to run any code before the app starts or even to change the bytecode. The cool thing about agents is that they are a runtime thing, so we do not need to change the source code of the target app. Agents are similar to Aspects but IMHO much better. Mockito uses agents in order to mock difficult scenarios, and pretty much all observability solutions for logs and metrics also have agents. Today I want to share two POCs: one using a vanilla Java app and doing bytecode manipulation, the second using Spring Boot 2.x and running code forever in a background thread while the app also runs. So let's get started!
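For readers who have not seen one, the skeleton of an agent can be sketched with the standard `java.lang.instrument` API. This is not one of the POCs from the post: the `LoggingAgent` class, the `com/example/` default prefix, and the `isTargetClass` helper are hypothetical, and the transformer only reports classes as they load instead of rewriting bytecode. `premain` is the static entry point (used with `-javaagent:agent.jar`, where the jar manifest names the class in `Premain-Class`), and `agentmain` is the dynamic one (used on attach, named in `Agent-Class`).

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

class LoggingAgent {

    // Static agent entry point: the JVM calls this before the application's main method.
    // agentArgs is whatever follows '=' in -javaagent:agent.jar=com/example/
    public static void premain(String agentArgs, Instrumentation inst) {
        // Hypothetical default: only watch classes under com/example/
        final String prefix = (agentArgs == null || agentArgs.isEmpty()) ? "com/example/" : agentArgs;

        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader,
                                    String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (isTargetClass(className, prefix)) {
                    System.err.println("[agent] loading " + className
                            + " (" + classfileBuffer.length + " bytes)");
                    // A bytecode-manipulating agent would return modified bytes here,
                    // typically produced with a library such as ASM or Byte Buddy.
                }
                return null; // null tells the JVM to keep the class bytes unchanged
            }
        });
    }

    // Dynamic agent entry point: called when the agent is attached to a running JVM.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        premain(agentArgs, inst);
    }

    // Class names arrive in internal form (slashes, not dots) and may be null for some classes.
    static boolean isTargetClass(String internalName, String prefix) {
        return internalName != null && internalName.startsWith(prefix);
    }
}
```

Because everything happens in the transformer, the target application needs no source changes; it only needs to be started with the agent flag or attached to later.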

Log4j2 Async Logger and Benchmark

Log4j is an old logging framework. Some time ago Logback was faster than Log4j. Now, Log4j2 is the fastest and coolest kid on the block. Logs are the oldest and most primitive form of observability. IMHO observability has many different use cases, such as Troubleshooting, Business Discovery, Performance Troubleshooting, Reliability Assurance, and Testing. IMHO you could use logs for all those use cases; however, IMHO logs are best suited for troubleshooting, and metrics are better suited for general observability. Logs can destroy application performance and often are the number 1 culprit of slowness. Log4j2 makes logging faster thanks to async logging, with the LMAX Disruptor as the secret sauce. However, I would question relying on logs alone without also using metrics and traces.

Observability & Domain Observability: From Understanding to Value

Observability is a must-have property of any mature distributed system solution and/or digital product. There is no way to "buy" observability; you need to earn it. Observability is like car insurance. No one likes to pay for insurance; however, if there is a car crash you really, really want it. And you cannot acquire the insurance at the very moment of the crash; it does not work that way. The car insurance metaphor only partially works if you have a monolithic system. Once you have microservices, and therefore a distributed system, you have many more failure points, and detecting that the "car crashed" can be much harder and more complicated. There are many challenges in getting observability into a system. The main idea is that Observability is something LIVE, which means it is not a one-time job. So you want to have a better understanding of your system via instrumentation, which will lead to better observability, and this loop keeps going. Some time abou...