@gnanet
Last active March 28, 2025 17:33
2vac or not2vac - fail2ban status persistence vs exploding diskusage of fail2ban.sqlite3

I recently ran short of storage on a server again, so I did my homework. The system is Debian GNU/Linux 7.11 (wheezy).

Dumping the sqlite3 database into a gzipped text file shows a striking size difference. It means that without regular maintenance of the fail2ban database, the system may run low on free disk space.

# du -hs fail2ban.sqlite3
2.5G    fail2ban.sqlite3

# sqlite3 ./fail2ban.sqlite3 .dump | gzip -c - > fail2ban.sqlite3.sql.gz

# du -hs fail2ban.sqlite3.sql.gz
188M    fail2ban.sqlite3.sql.gz

# sqlite3 ./fail2ban.sqlite3 .dump | wc -l
4389727

# zcat fail2ban.sqlite3.sql.gz | wc -l
4389727

It is known that fail2ban versions before v0.11 had a bug that kept the purge from ever being invoked.

This system only has fail2ban v0.9.7, so I looked around for a solution.

The problems that accompany a self-made vacuum solution are widely known:

  • changes that involve a lot of rows create a journal of approximately the same size as the db itself
    • without the purge there will be a lot of rows to delete, starting from 2016.Jul.17 02:23:27, if I want to keep at most 1 week
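
Because the journal can grow to about the size of the database, it may be worth checking free space before starting. A minimal sketch, assuming `du -k` and `df -Pk` are available and that the filesystem name contains no spaces; the function name `has_room_for_journal` is made up for illustration and is not part of fail2ban:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the filesystem holding the file has
# at least as much free space as the file itself occupies, i.e. room for
# a journal roughly as large as the database.
has_room_for_journal() {
    local db_file=$1
    local db_kb free_kb
    # Disk usage of the database file in KiB.
    db_kb=$(du -k "$db_file" | awk '{print $1}')
    # Available KiB on the filesystem that contains the file
    # (POSIX output format: second line, fourth column).
    free_kb=$(df -Pk "$(dirname "$db_file")" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$db_kb" ]
}

# Example call (path as used later in this gist):
# has_room_for_journal /var/lib/fail2ban/fail2ban.sqlite3 || echo "not enough free space"
```

The check is deliberately rough: it only compares sizes at one moment and does not reserve anything.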

My estimate of the time required for the whole operation made me consider two options: cleaning up on the fly, or copying the db and cleaning up the copy, with the risk that by the time I finish there would be too many differences to merge. In the end I decided to stop fail2ban and accept the risk of being without it for the duration of the cleanup.
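
For reference, the "copy and clean" option that was considered but not used could look roughly like the sketch below. It assumes the `sqlite3` CLI is installed and uses its `.backup` command on a throwaway database; the one-table schema is a simplified stand-in, not fail2ban's real schema, and all file names are illustrative:

```shell
#!/bin/bash
# Sketch only: take a consistent copy of a database and clean up the copy.
workdir=$(mktemp -d)
live="${workdir}/live.sqlite3"
copy="${workdir}/copy.sqlite3"

# Stand-in for the live db: one very old ban (timeofban=1) and one from right now.
sqlite3 "$live" "CREATE TABLE bans(jail TEXT, ip TEXT, timeofban INTEGER);
INSERT INTO bans VALUES('sshd', '192.0.2.1', 1);
INSERT INTO bans VALUES('sshd', '192.0.2.2', CAST(strftime('%s','now') AS INT));"

# .backup is a sqlite3 shell command that writes a consistent copy of the db.
sqlite3 "$live" ".backup '${copy}'"

# Clean up the copy only; the live db is left untouched.
sqlite3 "$copy" "DELETE FROM bans WHERE timeofban < (CAST(strftime('%s','now') AS INT) - 604800); VACUUM;"

live_rows=$(sqlite3 "$live" "SELECT count(*) FROM bans;")
copy_rows=$(sqlite3 "$copy" "SELECT count(*) FROM bans;")
echo "live=${live_rows} copy=${copy_rows}"
rm -rf "$workdir"
```

The drawback named above remains: anything written to the live db after the copy was taken is missing from the cleaned copy and would have to be merged by hand.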

Now on to the numbers.

  • The grand total of the dump: 4389727 lines
  • Sum of the bans older than 1 week, counted per jail: 4388242 rows
  • So there were 4388242 rows to delete from fail2ban.sqlite3
# sqlite3 ./fail2ban.sqlite3 .dump | wc -l
4389727

# sqlite3 ./fail2ban.sqlite3 "select jail, count(*) as bancount from bans WHERE timeofban < (CAST(strftime('%s') as INT) - 604800) group by jail;"
recidive|31826
sshd|4247455
wprecidive|108961

# echo $(( 31826 + 4247455 + 108961)) 
4388242
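
The constant 604800 in the query above is one week in seconds, and `strftime('%s')` yields the current Unix time. The same cutoff can be computed in the shell, for instance to see how far back the delete will keep data. A small sketch, assuming GNU `date` (for the `-d @...` syntax); the variable names are illustrative:

```shell
#!/bin/bash
# One week in seconds: 7 days * 24 h * 60 min * 60 s.
WEEK_SECONDS=$(( 7 * 24 * 60 * 60 ))
echo "${WEEK_SECONDS}"    # prints 604800

# Current Unix time minus one week: bans with a timeofban below
# this value count as "older than a week".
CUTOFF=$(( $(date +%s) - WEEK_SECONDS ))

# Human-readable form of the cutoff (GNU date syntax).
date -d "@${CUTOFF}"
```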

Let's start the delete:

# sqlite3 ./fail2ban.sqlite3 "delete from bans WHERE timeofban < (CAST(strftime('%s') as INT) - 604800);"

After the delete (which took many hours, and during which the journal grew to the size of the db), the size of fail2ban.sqlite3 had not changed. It was time to do the cleanup, so I ran VACUUM on the db. Luckily, that did not take long. As expected, the journal again grew to almost the size of the db, but in both cases the journal vanished once the operation finished.

# sqlite3 ./fail2ban.sqlite3 "VACUUM;"

The grand total of the resulting db's dump: 1485 lines. The size of the final database: less than 1 MB!

# sqlite3 ./fail2ban.sqlite3 .dump | wc -l
1485

# du -hs fail2ban.sqlite3
904K    fail2ban.sqlite3
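
Why the file only shrinks after VACUUM can be reproduced on a throwaway database: deleted rows leave free pages inside the file, and VACUUM rebuilds the file without them. A sketch, assuming the `sqlite3` CLI is installed and the database uses the default settings (no auto_vacuum); the one-table schema is a simplified stand-in, not fail2ban's real schema:

```shell
#!/bin/bash
# Scratch database in a temp dir; nothing here touches the real fail2ban db.
workdir=$(mktemp -d)
db="${workdir}/scratch.sqlite3"

# Fill a simplified stand-in table with 20000 rows inside one transaction.
{
    echo "CREATE TABLE bans(jail TEXT, ip TEXT, timeofban INTEGER);"
    echo "BEGIN;"
    for i in $(seq 1 20000); do
        echo "INSERT INTO bans VALUES('sshd', '192.0.2.1', ${i});"
    done
    echo "COMMIT;"
} | sqlite3 "$db"
size_full=$(wc -c < "$db")

# Delete all but the 10 newest rows: pages become free, the file does not shrink.
sqlite3 "$db" "DELETE FROM bans WHERE timeofban <= 19990;"
size_deleted=$(wc -c < "$db")
free_pages=$(sqlite3 "$db" "PRAGMA freelist_count;")

# VACUUM rebuilds the file and drops the free pages.
sqlite3 "$db" "VACUUM;"
size_vacuumed=$(wc -c < "$db")

echo "full=${size_full} after_delete=${size_deleted} free_pages=${free_pages} after_vacuum=${size_vacuumed}"
rm -rf "$workdir"
```

`PRAGMA freelist_count` reports the number of unused pages, which gives a rough idea of how much a VACUUM would reclaim before actually running it.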

After replacing fail2ban.sqlite3 and starting fail2ban-server, I saw no issues resulting from my work.

The next step will be to create a shell script that does a daily delete and vacuum, which I expect to finish quickly enough to be done on the fly.

But that is the future.

#!/bin/bash
# Daily fail2ban database maintenance: delete bans older than one week, then VACUUM.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
DB_PATH=/var/lib/fail2ban
F2B_DB=fail2ban.sqlite3
DB_FILE="${DB_PATH}/${F2B_DB}"
SELF=$(basename "$0")

# Guard against accidental runs: the script only works when called with "run-cron".
if [[ "$1" != "run-cron" ]]; then
    echo "${SELF} is protected from accidental runs" >&2
    exit 127
fi

# Log whether fail2ban-server is running before the database is touched.
BEFORE_PID=$(pgrep fail2ban-server)
if [[ -n "${BEFORE_PID}" ]]; then
    logger -t "${SELF}" "INFO: fail2ban-server running"
fi

# Compute the cutoff once (now minus 604800 s = 1 week) so all queries use the same threshold.
CUTOFF=$(( $(date +%s) - 604800 ))

BANS_OVERAGED=$(/usr/bin/sqlite3 "${DB_FILE}" "select count(*) from bans WHERE timeofban < ${CUTOFF};")
logger -t "${SELF}" "INFO: About to delete ${BANS_OVERAGED} bans older than a week"
/usr/bin/sqlite3 "${DB_FILE}" "delete from bans WHERE timeofban < ${CUTOFF};"

# Verify that no old bans are left. An empty result (sqlite3 error) also counts as failure.
# The earlier "-le" comparison also reported FAIL when there was nothing to delete (0 <= 0).
BANS_CONTROL=$(/usr/bin/sqlite3 "${DB_FILE}" "select count(*) from bans WHERE timeofban < ${CUTOFF};")
if [[ "${BANS_CONTROL}" != "0" ]]; then
    logger -t "${SELF}" "FAIL: after delete of ${BANS_OVERAGED} old bans, there are still ${BANS_CONTROL} old bans in the database"
    echo "FAIL: after delete of ${BANS_OVERAGED} old bans, there are still ${BANS_CONTROL} old bans in the database" >&2
    exit 1
fi

logger -t "${SELF}" "INFO: About to vacuum the database"
/usr/bin/sqlite3 "${DB_FILE}" "VACUUM;"

# Check that fail2ban-server is still running after the maintenance.
AFTER_PID=$(pgrep fail2ban-server)
if [[ -n "${AFTER_PID}" ]]; then
    logger -t "${SELF}" "SUCCESS: After vacuum, fail2ban-server running"
else
    logger -t "${SELF}" "FAIL: After vacuum, fail2ban-server is not running"
    echo "FAIL: After vacuum, fail2ban-server is not running" >&2
    exit 1
fi