Understanding Heroku Logs
July 23, 2016
I’ve been using heroku for hosting my apps for free. However, free accounts on heroku are not provided with metrics.
Heroku logs provide a lot of information about about the app,
but searching them even with grep
seems very tedious.
This post describes my attempt to parsing and
visualizing the server logs to make performance monitoring easier.
The Log Structure
While the heroku logs
command gives a collection of different types of logs, the following structure is
common among them :
- Each new log is on a new line and has space separated elements.
- The first element is an ISO 8601 timestamp with nanosecond precision.
- The second element is the source process of the log.
- Everything after the second element is the content of the log.
[timestamp] [source] [key1]=[value1] ...
On the basis of the content of the log, it can be classified into the following major categories :
- Log about an incoming request. It contains the HTTP Verb, the request URL and the time taken by the server to send a response in milliseconds
- Log about memory consumtion of the app. Information about swap and cache use, total memory use and number of pages read and written.
- Log about average load. Load averages at 1 minute, 5 minutes and 15 minutes.
- Miscellaneous Log. Logs from the app etc.
Parsing
Next I got to parsing. Following is the python 2.7 script I used to parse the logs and convert them to json.
Here, I used the shlex
module which makes it very convinient to read quoted strings.
Visualisation
The json generated by the parsing script can be loaded directly as a pandas DataFrame.
memory_cache | memory_pgpgin | memory_pgpgout | memory_quota | memory_rss | memory_swap | memory_total |
---|---|---|---|---|---|---|
0.00MB | 1332855pages | 1331345pages | 512.00MB | 153.61MB | 4.70MB | 158.31MB |
Using matplotlib and seaborn, I was able to generate the following visualizations.
They can be interpreted as per the descriptions on heroku: https://devcenter.heroku.com/articles/metrics
Also, using plt.gcf().autofmt_xdate()
when plotting time series data makes the time readable.