11 May 2019

Logging A Lot In Journald

I've been dealing with a service recently that, at least for a while longer, needs to log a serious amount of data. We are talking hundreds of thousands of lines every hour. The challenge was getting systemd to keep more than a days worth of data at this rate.

My first pass at this was to set the SystemMaxUse to 50%. journald settings such as this can be changed using a configuration file in /etc/systemd/journald.conf.d/ (placing your own *.conf configuration file here to override the defaults). The quick start tip is to copy /etc/systemd/journald.conf to /etc/systemd/journald.conf.d/awesomelog.conf and make your changes.

After hitting save and restarting the journald service I assumed the task was sorted. Unfortunately not, investigating the service later showed I had only around 20 hours worth of logs. This was on a file system that had plenty of space still. journald hadn’t used anywhere near the 50% max I set…

[root]# du -chs /var/log/journal/
4.0G	/var/log/journal/
4.0G	total

Journal files up to 4GB exactly and no further? Seems a bit suspicious! It turns out SystemMaxUse does not work how I expected. Running man journald.conf and grepping for SystemMaxUse explains all (emphasis added by me). Also as a mini aside, settings prefixed with System are for persistent storage, where as Runtime refers to in-memory storage.

SystemMaxUse= and RuntimeMaxUse= control how much disk space the journal may use up at maximum.  SystemKeepFree= and RuntimeKeepFree= control how much disk space systemd-journald shall leave free for other uses. systemd-journald will respect both limits and use the smaller of the two values.

The first pair defaults to 10% and the second to 15% of the size of the respective file system, but each value is capped to 4G.

Now a bit of experimentation at this point told me that the 4G cap is ignored if you set the value in gigabytes and not as a percentage. That means we can achieve our 50% SystemMaxUse value, just we’ll need to set it absolutely based on the current machine. Not super portable but good enough!

We can confirm this change has really happened by restarting systemd-journald then using journalctl to look at the logs of systemd-journald (pretty meta!)

# Before at 50% (on a ~30GB disk)...
systemd-journal[15950]: Permanent journal is using 3.9G (max allowed 4.0G, trying to leave 4.0G free of 27.6G available → current limit 4.0G).

# After our 16G absolute value...
systemd-journal[16004]: Permanent journal is using 3.9G (max allowed 16.0G, trying to leave 4.0G free of 27.1G available → current limit 16.0G).

Success!

Extra: journalctl Tips (Basics?)

Perhaps not worth a blog post by itself but did you know you can specify relative times with journalctl? Previously I’d been using it on and off without realising this was a thing, just awkwardly typing out full dates and times. Works for various time units as you’d expect (e.g. --since -30m for last 30 minutes, or --since -8h for last 8 hours etc).

Another useful tip is, if you need to search for events over multiple services, you can specify the systemd unit multiple times and get a combined view. e.g. journalctl -u service-foo -u service-bar --since -10m.

Dev SysAdmin
Back to posts