Analysing ELB logs with Logstash and Kibana

Last week, Amazon launched the ability to turn on per-request logging in Elastic Load Balancers (ELBs), a much sought-after feature that users have been asking about for some time. Finally, it is here.

However, logging is only half the battle. Once you have the logs, you have to do something with them to figure out whether there are any problems that need addressing or improvements that can be made.

The AWS announcement mentions a couple of commercial services that can work with the logging format directly, but we wanted to experiment with the logs ourselves first to see what we could get from them.

To that end, we decided to set up ElasticSearch, Logstash and Kibana to parse the logs from import.io's load balancers so that we could kick off our analysis.

Setting up logging

The AWS blog post has a good guide to the practicalities of setting up ELB logging to S3 buckets.

As we have several load balancers to monitor in each CloudFormation stack that we run, we decided to combine all of the load balancers from one stack into the same S3 bucket, using a parent directory for each load balancer to differentiate them.

For example, we set up our front-end load balancer to log to “s3://our-log-bucket/front”, the API one to “s3://our-log-bucket/api” and the query load balancer to “s3://our-log-bucket/query”. You can set this up easily in the AWS console. Just be aware that if you take this approach and tick “Create the location for me” for the first load balancer, the policy AWS creates for the bucket will need to be changed to allow the ELB service write access to the whole bucket, not just that one prefix.
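As a rough illustration, a bucket policy along these lines grants the ELB log-delivery account write access to the whole bucket. This is only a sketch: the account ID shown is the one AWS uses for us-east-1 and differs per region, and “our-log-bucket” is our example bucket name.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": { "AWS": "arn:aws:iam::127311923021:root" },
          "Action": "s3:PutObject",
          "Resource": "arn:aws:s3:::our-log-bucket/*"
        }
      ]
    }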

We would also recommend setting this up as soon as possible, so the ELBs can be logging away while you set up the rest of the tooling (or in case you need to come back to it later).

Setting up ElasticSearch, Logstash and Kibana

Since Elasticsearch (the company) maintains both the Logstash and Kibana projects, it is very easy to get them all started up together.

The first task is to download the Logstash JAR. This comes with embedded ElasticSearch and Kibana instances, which we will use for this demo as it is the easiest way to get going. It is also quick to swap these out for full ElasticSearch and Kibana deployments when you need to. You can download the latest JAR file from the Logstash website.

The final step of the setup is configuration. This is the Logstash configuration file we are going to use; save it in the same directory as your Logstash JAR file, named “logstash.conf”:
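A minimal sketch of such a configuration is shown below, assuming the classic ELB access log format and the embedded ElasticSearch output; the field names (elb_status_code, loadbalancer and so on) are labels we have chosen rather than anything mandated by Logstash.

    input {
      file {
        # Point this at a directory that exists; we will sync the ELB logs into it later.
        path => "/your/data/path/**/*.log"
        type => "elb"
        # Read files from the top rather than tailing them, so existing logs get indexed.
        start_position => "beginning"
      }
    }

    filter {
      grok {
        # Illustrative pattern for the classic ELB access log line; field names are our own.
        match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{INT:client_port} %{IP:backend_ip}:%{INT:backend_port} %{NUMBER:request_processing_time} %{NUMBER:backend_processing_time} %{NUMBER:response_processing_time} %{INT:elb_status_code} %{INT:backend_status_code} %{INT:received_bytes} %{INT:sent_bytes} \"%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\"" ]
      }
      date {
        # Use the timestamp from the log line as the event time, not the indexing time.
        match => [ "timestamp", "ISO8601" ]
      }
    }

    output {
      # Index into the ElasticSearch instance embedded in the Logstash JAR.
      elasticsearch { embedded => true }
    }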

You only need to modify this file to make the "/your/data/path" in the file input point to a directory that exists, which we will later populate with your ELB log data (make sure you leave the "/**/*.log" part on the end). The match line in the grok filter is the pattern Logstash needs in order to parse data out of the ELB log files correctly; it picks out all of the pieces of data so that we can use Kibana to analyse them later on.

Getting your ELB log data

Now that your ELBs are storing data into S3, we need a way to retrieve the logs so that Logstash can pick them up. We use the very cool s3cmd tool to work with S3 on the command line, and I am going to use it to pull the entire contents of the bucket onto the machine so that we can store it into ElasticSearch.

First, you need to configure IAM so that you can download the contents of the S3 bucket. We recommend creating a new IAM user for this task and giving them access to all S3 commands on the logging bucket, using the bucket permissions tooling on the command line or in the console.

The final step is to start synchronising down the data. If you are using the example data directory of "/your/data/path", then you need the following commands, replacing <BUCKET_NAME> with the name of the S3 bucket your logs are stored in:
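Something along these lines should do it, assuming s3cmd has not been configured on this machine before:

    # One-time setup: store the new IAM user's access and secret keys for s3cmd.
    s3cmd --configure

    # Mirror the whole logging bucket into the local directory Logstash is watching.
    s3cmd sync s3://<BUCKET_NAME>/ /your/data/path/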

The command will show its progress as it downloads each file, along with how many downloads it has completed and how many remain.

Indexing the data

The penultimate step is to index the data we have collected from S3 into ElasticSearch so we can analyse it. We do this by running the logstash JAR we downloaded earlier. First, go to the directory you downloaded the JAR file to, then run it with the following command:
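The exact invocation depends on the version of the JAR you downloaded; with a Logstash 1.2.x "flatjar" (the filename below is just an example), it looks something like this:

    # Start the Logstash agent with our config, verbosely, plus the embedded web UI.
    java -jar logstash-1.2.2-flatjar.jar agent -f logstash.conf -v -- web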

This will start Logstash as a collection agent, with the logstash.conf configuration we created earlier, and with the web UI running too.

As the app starts, it will log out all of the lines it is parsing from the ELB log files. You can turn this off by removing the “-v” argument from the above command.

You can actually start running this command while the files are downloading. Logstash will read the files as they are downloaded (as long as it can keep up, at least) and index the entries as it goes.

Viewing and analysing the data

Now, the final step. While the logstash JAR file is running, its embedded ElasticSearch and Kibana instances can be viewed with a regular web browser. The link you need to get started is:

http://localhost:9292/index.html#/dashboard/file/logstash.json

This will open Kibana with the default logstash dashboard enabled. Here is the output that we get for one of the import.io staging environments for a few minutes:

All requests to some of our staging load balancers during a day

From here, we can leverage the full power of Kibana to analyse our logs: checking for requests that take a long time, filtering on specific status codes, and more.

Here is an example where we have generated a chart of errors (i.e. non-200 responses) for the same period, which highlights a problem area we may wish to investigate:

Requests to some of our staging load balancers that errored during the day
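One way to build that filter, assuming the elb_status_code field name from the configuration sketch above, is a Kibana query such as:

    NOT elb_status_code:200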

You can also generate a pie chart of these error response codes:

Pie chart of errors returned by some of our staging load balancers

If you have any comments or tips on using these tools then get in touch with us on Twitter!
