GoAccess Automated Reports - Last 30+ Days via Cron

First, if you're not familiar with GoAccess, here's a quick description from their website, http://goaccess.io/:

GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems. It provides fast and valuable HTTP statistics for system administrators that require a visual server report on the fly.

Typically you run goaccess on a single log file like this: goaccess -f access.log. But in our case we wanted to run it on multiple log files for a month-end report. That raised two problems: we needed to pass in multiple files, and we needed to limit the report to a date range.

Below is the script we came up with. It's not perfect, but it's simple and does a pretty good job of what we needed. It finds all the gzipped access.log files modified within the past 35 days and pipes their contents into goaccess. We chose 35 days as a safety margin: a rotated file can contain more than one day of traffic, so a 35-day window should always cover at least the last 30 days. Finally, we save the report as an HTML file named by year and month, e.g. monthly-2015.05.html.

#!/bin/bash
# Report label in year.month form, e.g. 2015.05
DATE=$(date +'%Y.%m')
zcat $(find /var/log/apache2/ -name "access.log.*.gz" -mtime -35) | goaccess > /dir/monthly-$DATE.html
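
One thing the one-liner glosses over is that the output of find is word-split by the shell, and zcat gets no file arguments at all if nothing matches. Here's a slightly more defensive sketch of just the file-selection step. The function name recent_logs and its two arguments are our own invention for illustration, gzip -dc is used as the equivalent of zcat, and -r is a GNU xargs option, so treat this as a starting point rather than a drop-in replacement.

```shell
#!/bin/sh
# Sketch: print the decompressed contents of rotated access logs
# modified within the last N days. recent_logs is a hypothetical helper.
recent_logs() {
    log_dir=${1:-/var/log/apache2}   # directory to search (default from the script above)
    days=${2:-35}                    # look-back window in days

    # -print0 / -0 pass filenames NUL-separated, so spaces do not break them.
    # -r (GNU xargs) skips running gzip entirely when find matches nothing.
    find "$log_dir" -name 'access.log.*.gz' -mtime "-$days" -print0 |
        xargs -0 -r gzip -dc
}
```

With that in place, the last line of the script could become: recent_logs /var/log/apache2 35 | goaccess > /dir/monthly-$DATE.html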

Then just save the script (here as /dir/goaccess-monthly.sh) and add a cron job to run it at midnight on the first of each month.

# as a shell script
00 00 01 * * /bin/bash /dir/goaccess-monthly.sh

# or as a single cron job line (% is special in a crontab and must be escaped as \%)
00 00 01 * * zcat $(find /var/log/apache2/ -name "access.log.*.gz" -mtime -35) | goaccess > /dir/monthly-$(date +'\%Y.\%m').html
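
Because the job runs on the first of the month, date +'%Y.%m' labels the report with the new month rather than the month most of the data comes from. If you'd rather name the file after the previous month, something like the sketch below should work. prev_month_label is a hypothetical helper of ours, and it takes an optional year and month only so it can be tried out with fixed values.

```shell
#!/bin/sh
# Sketch: print the previous month as YYYY.MM.
# prev_month_label is a hypothetical helper; with no arguments it uses today's date.
prev_month_label() {
    year=${1:-$(date +%Y)}
    month=${2:-$(date +%m)}
    month=${month#0}              # strip a leading zero so 08 and 09 are not read as octal

    if [ "$month" -eq 1 ]; then
        year=$((year - 1))        # January rolls back to December of the previous year
        month=12
    else
        month=$((month - 1))
    fi

    printf '%s.%02d\n' "$year" "$month"
}

prev_month_label 2015 06   # prints 2015.05
```

In the script, that would mean setting DATE=$(prev_month_label) instead of DATE=$(date +'%Y.%m').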

Also check out the goaccess man page (man goaccess) for more information on the various options and settings.
