10x: Logging at Clay.io

logstash logo

Managing 20+ servers as a small team is no easy task, and when things go wrong (they always do), figuring out what happened quickly is essential. Of course we can't SSH into each machine individually; that would take ages. Instead, we use Logstash to aggregate our logs.

This is the second post in my 10x series. If you missed the last episode: Architecture at Clay.io.

Logstash overview

Logstash deployments have two parts: the aggregation server (or cluster) and the client servers. The aggregation server runs the main Logstash binary, while each client server runs logstash-forwarder, which ships logs to the Logstash instance. Here is a diagram of our setup:
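As a sketch, a minimal logstash-forwarder client config looks something like this (the hostname, port, certificate path, and log paths below are illustrative assumptions, not our actual values):

```json
{
  "network": {
    "servers": ["logstash.internal:5043"],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 15
  },
  "files": [
    {
      "paths": ["/var/log/nginx/*.log"],
      "fields": { "type": "nginx" }
    }
  ]
}
```

The `fields` block tags each shipped event (here with a `type` of `nginx`), which makes filtering on the aggregation side much easier.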

scribe

Note that logstash-forwarder doesn't rotate logs. Log rotation is the process of archiving older logs (usually compressed) into buckets (either by size or by date) and eventually expiring them. The (aptly named) tool for the job here is the logrotate Linux utility.
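A minimal logrotate sketch for the files logstash-forwarder is tailing might look like this (the path and retention policy are assumptions):

```
/var/log/nginx/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```

`copytruncate` truncates the live file in place instead of renaming it, which keeps a tailing process like logstash-forwarder attached to the same file across rotations.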

Investigating logs

kibana

The Logstash server runs ElasticSearch and Kibana for analyzing logs. Because the logs are indexed in ElasticSearch, you can apply many filters and run complex queries against your log data.
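For example, Kibana accepts Lucene query-string syntax, so queries like these narrow things down quickly (the field names and values here are illustrative, not from our actual setup):

```
type:"nginx" AND status:500
host:"web-3" AND message:"timeout"
level:error AND NOT type:"healthcheck"
```

Combined with Kibana's time-range picker, a query like the first one surfaces every 500 across all 20+ servers in one view.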

Note that you must not expose ElasticSearch's ports publicly, or your server will get hacked (!). This happened to us; our solution was to move our infrastructure into an Amazon VPC, so the only way to get to the logs is from inside our privileged network.
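If moving to a VPC isn't an option right away, at minimum firewall the ElasticSearch ports. A sketch with iptables (the CIDR below is an assumed private range; substitute your own):

```
# Allow ElasticSearch HTTP (9200) and transport (9300) only from the private network
iptables -A INPUT -p tcp --dport 9200 -s 10.0.0.0/16 -j ACCEPT
iptables -A INPUT -p tcp --dport 9300 -s 10.0.0.0/16 -j ACCEPT
# Drop everything else hitting those ports
iptables -A INPUT -p tcp --dport 9200 -j DROP
iptables -A INPUT -p tcp --dport 9300 -j DROP
```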

Journalist + Scribe

At Clay.io we have open-sourced our Docker containers so that anyone can deploy a distributed logging system themselves in just a few seconds.

- Logstash server: clay/journalist-public
- logstash-forwarder client: clay/scribe-public

Good Luck!