Alerting and Monitoring with Sensu part 1

Sensu is an infrastructure monitoring and telemetry solution. Big sites such as Yahoo, Yelp, Cisco, and Tesla use it.

Sensu provides standard solutions for monitoring and alerting, which I can say are okay. However, if you have been following my Telemetry and Microservices blog post series (part 1 and part 2), you will see that Sensu is not ideal for other use cases like:

* Dynamic Thresholds / ML
* Infrastructure Tracking
* Canary

That is because Sensu doesn't have a time series database (TSDB). You can use Sensu with Graphite or InfluxDB to work around it. However, IMHO the better solution would be to have collectors (collectd) gather basic host metrics plus app instrumentation, and then write checkers that run queries over the TSDB.
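
To make the TSDB point concrete, here is a minimal sketch of a checker that needs history, which a single host reading cannot give you. It is my own illustration, not something Sensu ships: the function name, the k_warn/k_crit multipliers and the idea of passing in points already fetched from Graphite or InfluxDB are all assumptions.

```python
import statistics

OK, WARNING, CRITICAL = 0, 1, 2


def dynamic_threshold_status(history, current, k_warn=2.0, k_crit=3.0):
    """Compare the current value against thresholds derived from history.

    `history` stands for points previously returned by a TSDB query
    (hypothetical; the query itself is out of scope for this sketch).
    """
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    warn_limit = mean + k_warn * stdev
    crit_limit = mean + k_crit * stdev
    if current > crit_limit:
        return CRITICAL
    if current > warn_limit:
        return WARNING
    return OK
```

The return value follows the same 0, 1, 2 convention as any other checker, so a script like this could still be scheduled by Sensu while the thresholds themselves come from the data.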

Sensu Features

  • Centralized dashboard through Uchiwa
  • Alerting via checkers: you call a script or API which returns 0, 1 or 2.
  • Notifications: email, Slack, HipChat, PagerDuty.
  • Nagios and Zabbix script reuse, since they all use the same format: 0, 1 and 2.
  • Dynamic client registration and de-registration
  • Lots of plugins, most of them in Ruby and Python

Sensu can be deployed in cloud-native environments or in traditional bare-metal data centers if you wish. Sensu has a client/server architecture based on a pub/sub model delivered by RabbitMQ. The Sensu server works with Debian and CentOS, and the Sensu client can work with Debian, CentOS, Mac, Windows, AIX, Solaris, and FreeBSD.

Sensu is heavily dependent on the OS. The Sensu client must be installed on each host, and the configs are stored in JSON files. Sensu check scripts often run on the host side, which means you need to copy the scripts to the host and have a provisioning solution like Ansible, Puppet or Chef.

I have mixed feelings about this. On one side of the coin, this is great for operations because they know Ansible. On the other side, it makes everything about provisioning and baking. What I don't like is that you will need to generate a new version of the AMI and redeploy your app again. That is especially bad for me: let's say you want to change a threshold, then you need to redeploy everything because of a single number?
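
As a rough idea of what such a JSON file can contain, the sketch below builds a standalone check definition and prints it. The check name, the plugin path and the --warn/--crit flags are hypothetical; the overall shape (a "checks" object with "command", "interval" and "standalone") is what I would expect a provisioning tool to drop on the host.

```python
import json


def standalone_check(name, command, interval=60):
    """Build a check definition shaped like a Sensu JSON config file."""
    return {
        "checks": {
            name: {
                "command": command,      # script copied to the host by provisioning
                "interval": interval,    # seconds between executions
                "standalone": True,      # scheduled by the client itself
            }
        }
    }


# Hypothetical plugin path and flags, for illustration only.
config = standalone_check(
    "check_disk",
    "/etc/sensu/plugins/check_disk.py --warn 80 --crit 90",
)
print(json.dumps(config, indent=2))
```

Note that the thresholds live inside the command string, in a file on the host.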

Sensu Architecture

Sensu works in a client-server architecture where the server schedules checkers to run on the clients, which are deployed on the hosts. RabbitMQ is used as the transport mechanism between the Sensu server and the clients. Results are stored in memory using Redis on the Sensu server. All this information can be queried through the API, which reads the information stored in Redis.

Sensu Core Concepts

Sensu has some core concepts: checkers, event processors, handlers, and mutators. Let's see them one by one in more detail.

Service Checkers: You have to use checkers in order to do host monitoring. For instance, your web server's disk being full, the CPU being high, or memory growing a lot are all thresholds you might want to be alerted on. So you write a script or call an API which returns 0 if all is good, 1 meaning something is in warning, and 2 meaning critical.
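
A minimal sketch of such a checker, assuming a disk usage check with default thresholds of 80% and 90% (both numbers and all names here are my own choices for illustration):

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2


def disk_status(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to the 0/1/2 convention."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def main(path="/"):
    """Measure usage of `path`, print a one-line summary, return the status."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    status = disk_status(used_percent)
    label = ("OK", "WARNING", "CRITICAL")[status]
    print(f"CheckDisk {label}: {used_percent:.1f}% used on {path}")
    return status
```

To turn this into a real check script, end the file with sys.exit(main()) (after importing sys) so the status becomes the process exit code.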

Event Processing: This is how Sensu allows you to do some simple event processing. You can have another script which runs based on other handlers, so you can take actions based on alerts. For instance, if the disk is full you can use boto on AWS to get more disk, triggered by an alert. You can use this feature to do custom alerting, like sending an email or using a custom notification channel, or even to store metrics in a different place like a TSDB. You can also use other event processors such as filters, to get just the events you want, or mutators, which can change the data.
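
The filter and mutator ideas can be sketched as two plain functions. This is not Sensu's own filter syntax (filters are declared in configuration); it only shows the logic. The event layout used here (client.name, check.name, check.status, check.output, occurrences) is an assumption about the event JSON, and the field names in the output are made up.

```python
def keep_event(event, min_occurrences=3):
    """Filter idea: keep only critical events that have repeated enough."""
    status = event.get("check", {}).get("status")
    return status == 2 and event.get("occurrences", 0) >= min_occurrences


def mutate(event):
    """Mutator idea: reshape the event into a smaller record for a handler."""
    check = event.get("check", {})
    client = event.get("client", {})
    return {
        "host": client.get("name"),
        "check": check.get("name"),
        "status": check.get("status"),
        "output": (check.get("output") or "").strip(),
    }
```

A handler sitting behind these two would only see repeated critical events, already trimmed to the fields it cares about.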

Sensu Handlers: There are several types of handlers in Sensu. The default is the pipe handler, which runs a command and passes it the event data via STDIN, so you can use standard Linux commands. There are TCP and UDP handlers, transport handlers (which use RabbitMQ), and a special kind of handler called the set handler, which you can use to group event handlers.
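
Here is a minimal sketch of what a pipe handler script could look like: it reads one event as JSON from a stream and turns it into a notification line. The event field names are assumptions about the event JSON, and the message format is invented for the example.

```python
import json
import sys

LEVELS = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_alert(event):
    """Build a one-line notification from an event dictionary."""
    client = event["client"]["name"]
    check = event["check"]
    level = LEVELS.get(check["status"], "UNKNOWN")
    return f"[{level}] {client}/{check['name']}: {check['output'].strip()}"


def handle(stream):
    """Read a single JSON event from `stream` (STDIN for a pipe handler)."""
    event = json.load(stream)
    return format_alert(event)
```

As a script, the last line would be print(handle(sys.stdin)), or a call that sends the message to email, Slack or whatever channel you need.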

RESTful API: Sensu has a RESTful API that exposes information about the clients, checks, events, results, aggregates and stashes.
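
A small sketch of querying that API from Python with only the standard library. I am assuming the API listens on localhost port 4567 without authentication and that resources such as /events and /clients return JSON lists; adjust host, port and credentials to your setup.

```python
import json
import urllib.request


def api_url(resource, host="localhost", port=4567):
    """Build the URL for an API resource such as 'events' or 'clients'."""
    return f"http://{host}:{port}/{resource.strip('/')}"


def fetch(resource, **kwargs):
    """GET a resource and decode the JSON body (needs a reachable API)."""
    with urllib.request.urlopen(api_url(resource, **kwargs), timeout=5) as resp:
        return json.load(resp)


def critical_events(events):
    """Keep only the events whose check status is 2 (critical)."""
    return [e for e in events if e.get("check", {}).get("status") == 2]
```

With a running server, critical_events(fetch("events")) would give you the list of currently critical events.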

Sensu is lightweight and easy to use. There are some fundamental issues, like the baking issue, and some bad choices, like using a temp dir to store information, which can give you lots of pain :-) In the next blog post I will show how to work with Sensu in practice and provide a Vagrantfile so you can play with it easily.

Diego Pacheco
