Bacula database cleanup

By Stephane Carrez

Bacula maintains a catalog of files in a database. Over time the database grows and, despite automatic pruning and job cleanup, some information remains that is no longer necessary. This article explains how to remove such dead records from the Bacula catalog.

Bacula records the backup jobs that have been executed in the job table. For each job, it keeps the list of saved files in the file table. When you do a restore, you select the job to restore and pick files from that job. No file entry should be associated with a nonexistent job. Unfortunately, this is not always the case: I found that more than 2 million file entries pointed to jobs that no longer existed.

Discovering dead jobs still referenced

The first step is to find out which jobs have been deleted but are still referenced by the file table. First, let's create a temporary table that will hold the distinct job ids referenced by the files.

mysql> create temporary table job_files (id bigint);

The use of a temporary table was necessary in my case because the file table is so big and the ReadyNAS so slow that scanning the database takes too much time.

Now, we can populate the temporary table with the job ids:

mysql> insert into job_files select distinct file.jobid from file;
Query OK, 350 rows affected (8 min 53.26 sec)
Records: 350  Duplicates: 0  Warnings: 0

The list of jobs that have been removed but are still referenced by a file is obtained by:

mysql> select job_files.id from job_files
 left join job on job_files.id = job.jobid
 where job.jobid is null;
+------+
| id   |
+------+
| 2254 |
| 2806 |
+------+
2 rows in set (0.05 sec)
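
Before deleting anything, it can help to know how many file records each dead job still owns, since that determines how many delete batches will be needed. The query below is a minimal sketch, assuming the two job ids reported above and the lowercase table names used in this article (a stock Bacula schema may name the tables File and Job). The alias orphaned_files is only illustrative. No output is shown because the counts depend on the catalog, and the query may itself be slow on a large file table.

```sql
-- Count the orphaned file records for each dead job id found above.
-- orphaned_files is an illustrative alias, not a Bacula column.
select jobid, count(*) as orphaned_files
  from file
 where jobid in (2254, 2806)
 group by jobid;
```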

Deleting dead files

Deleting all the file records in one go was not possible for me because there were too many files to delete and the MySQL server on the ReadyNAS did not have enough resources to do it. I had to delete these records in batches of 100000 files and repeat the query several times for each dead job (each delete query took more than 2 minutes).

mysql> delete from file where jobid = 2254 limit 100000;
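
Retyping the delete query until it reports 0 rows affected is tedious, so the repetition can be scripted. The following is a sketch only, not what I actually ran: it assumes a MySQL version that accepts a routine parameter in LIMIT (5.5.6 or later) and an account allowed to create procedures. The procedure name purge_job_files and its parameters are hypothetical.

```sql
-- Hypothetical helper: delete the file records of one dead job in batches.
delimiter //
create procedure purge_job_files(in dead_jobid int unsigned,
                                 in batch_size int unsigned)
begin
  declare deleted bigint default 1;
  -- Repeat the batched delete until a pass removes nothing.
  while deleted > 0 do
    delete from file where jobid = dead_jobid limit batch_size;
    -- row_count() is the number of rows removed by the delete above.
    set deleted = row_count();
  end while;
end //
delimiter ;

-- One call per dead job id found earlier.
call purge_job_files(2254, 100000);
call purge_job_files(2806, 100000);
```

The batch size stays small on purpose: each delete remains short enough for a resource-constrained server, at the cost of more passes.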


This cleanup process allowed me to reduce the size of the file table from 10 million entries to 7 million. This improves the database performance and speeds up the Bacula catalog backup process.
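
To confirm that the cleanup is complete, the dead job ids can be checked once more and the table status inspected. This is a sketch under the same assumptions as the rest of the article (lowercase table names, job ids 2254 and 2806). The remaining alias is illustrative. The optimize statement is left commented out because it rebuilds the table and may take a long time on slow hardware; whether it is worth running depends on the storage engine and on how much free space the deletes left behind.

```sql
-- Should return 0 once every batch has been deleted.
select count(*) as remaining
  from file
 where jobid in (2254, 2806);

-- Rows gives the (possibly approximate) row count,
-- Data_free the space left unused inside the table.
show table status like 'file';

-- Optional: reclaim the space freed by the deletes.
-- optimize table file;
```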