Dedup 2 Index Rebuild
The Bacula Systems Support Team may ask you to rebuild your deduplication engine version 2 index. This article provides instructions on how to perform this process.
Prerequisites
Before proceeding with the rebuild, ensure that the storage space where the index is stored is no more than 40% full.
If the current storage location doesn’t meet this requirement, choose an alternative destination folder with enough free space to host the rebuilt index. The -o option of the `bdde-fsck` command specifies the full path of the new index.
Following these guidelines will help ensure a smooth index rebuild process. The subsequent sections will detail the step-by-step procedure for rebuilding your dedup index.
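The 40% requirement can be checked from the command line before starting. The following is a minimal sketch, not part of the Bacula tooling: the function names are hypothetical, and it assumes a POSIX `df` whose second output line carries the usage percentage in the fifth column.

```shell
# Print the usage percentage (digits only) of the file system holding a path.
# Assumes POSIX "df -P" output, where column 5 of line 2 is the capacity.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print "ok" when the usage percentage is at or below the limit (default 40),
# otherwise print "too full".
usage_check() {
    if [ "$1" -le "${2:-40}" ]; then
        echo ok
    else
        echo "too full"
    fi
}

# Example, using the illustrative index location from this article:
#   usage_check "$(fs_usage_pct /mnt/index)"
```

If the result is "too full", pick another folder and pass the new index path with the -o option.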
Instructions
1. Stop the Storage Daemon: the bdde-fsck and bdde-check tools require that the Storage Daemon is not running.
$ sudo systemctl stop bacula-sd
2. Rebuild the index: Use bdde-fsck.
In the example below, we use 10 threads (-T 10) to scan the container files in parallel and create smaller meta data files of the results.
These meta data files are stored in a subdirectory (verify) of the directory where the index file is located. Their names match the container files (like verify-00000000.ctn for beed2-00000000.ctn). These meta data files are much smaller (44 bytes/index record) than the container files themselves, and their total size should be smaller than the size of the current index file.
The container scanning phase of the process is designed to be restartable. In the event of a failure before completion, it can be restarted without the need to rescan containers that have already been successfully scanned.
The following information is required to set up the bdde-fsck command (examples are included):
- the filename of the index file: /mnt/index/beeindex.tch
- the directory where the container files are stored: /mnt/containers
- the output filename for the new index file: /mnt/index/repair.tch
- the output filename for the repair log: /mnt/index/repairlog.txt
Before running bdde-fsck, the user must set up the LD_LIBRARY_PATH environment variable:
$ export LD_LIBRARY_PATH=/opt/bacula/lib
Then, run the tool in the background with /usr/bin/nohup, redirecting its output to the repair log:
$ /usr/bin/nohup /opt/bacula/bin/bdde-fsck -i /mnt/index/beeindex.tch -D /mnt/containers -s -R -o /mnt/index/repair.tch -g -T 10 -dt > /mnt/index/repairlog.txt 2>&1 &
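Because the command is long, it can help to assemble it from the four values listed above and review it before launching. The sketch below only prints the command line and runs nothing. The function name is hypothetical, and the options are copied unchanged from the example above.

```shell
# Print the bdde-fsck command line for the given values:
#   $1 = current index file, $2 = container directory,
#   $3 = output index file,  $4 = number of scan threads
# Only the options shown in the example above are used. Paths containing
# spaces are not handled by this sketch.
fsck_cmdline() {
    printf '%s\n' "/opt/bacula/bin/bdde-fsck -i $1 -D $2 -s -R -o $3 -g -T $4 -dt"
}

# Example, using the illustrative paths from this article:
#   fsck_cmdline /mnt/index/beeindex.tch /mnt/containers /mnt/index/repair.tch 10
```

Once the printed line looks right, run it under /usr/bin/nohup with the redirection to the repair log as shown above.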
3. Monitoring progress (optional): The rebuild can take from several hours to many days depending on the amount of data in the containers and the performance of the container storage. The verify meta data sizes can be used to estimate progress.
For example:
First, determine the size in KiB of the verify directory:
$ du -sk /mnt/index/verify
38524
Second, use this result to compute the number of index entries that have been scanned from the containers:
$ expr 38524 \* 1024 / 44
896558
In this simple example there have been almost 900 thousand records scanned.
This number can be compared with the number of records in the current index to estimate the progress. The number of records in the current index, reported as RNUM (record number), can be found in the SD tracelog or by running bdde-check on the current index:
$ /opt/bacula/bin/bdde-check -i /mnt/index/beeindex.tch
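The arithmetic above can be wrapped in two small helpers. This is a sketch with hypothetical function names. It uses the 44 bytes per index record stated earlier, and the RNUM value in the example is invented for illustration.

```shell
# Convert the size of the verify directory (in KiB, as printed by "du -sk")
# into the number of scanned index records, at 44 bytes per record.
records_from_kib() {
    echo $(( $1 * 1024 / 44 ))
}

# Integer percentage of scanned records ($1) relative to the RNUM of the
# current index ($2).
progress_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Example with the du result from this article and a made-up RNUM of 2000000:
#   records_from_kib 38524        # prints 896558
#   progress_pct 896558 2000000   # prints 44
```

The percentage is only an estimate, since the rebuilt index need not contain exactly as many records as the current one.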
4. Proper completion: When the bdde-fsck job completes, there should be messages in the log file (/mnt/index/repairlog.txt):
: bdde-fsck.cc:595-0 syncing index
: bdde-fsck.cc:597-0 index closed
: bdde-fsck.cc:1103-0 Rebuild of the index ok
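A quick way to look for these three messages is to search the log for their text. The function below is a hypothetical helper, sketched on the assumption that the message wording matches the excerpt above. The source line numbers (595, 597, 1103) may differ between versions, so only the message text is matched.

```shell
# Print "ok" if all three completion messages appear in the log file ($1),
# otherwise print "incomplete" and return a non-zero status.
rebuild_status() {
    for msg in 'syncing index' 'index closed' 'Rebuild of the index ok'; do
        grep -q -- "$msg" "$1" || { echo incomplete; return 1; }
    done
    echo ok
}

# Example:
#   rebuild_status /mnt/index/repairlog.txt
```

An "ok" result only shows that the job reached its end. It does not replace the review of errors and warnings in the log.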
The user should submit the repairlog.txt along with the output from bdde-check for the new index file for review in a support ticket before moving on:
$ /opt/bacula/bin/bdde-check -i /mnt/index/repair.tch
Note
Depending on the errors or warnings found in the repairlog.txt file, additional steps (to be determined with the Bacula Systems Support Team) may be required.
5. Backup of meta data: Once the rebuild process has been confirmed correct, a backup of the meta data files should be made to avoid rescanning the containers if some other error presents itself:
$ mkdir /mnt/index/verify.bkp
$ cp /mnt/index/verify/* /mnt/index/verify.bkp
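After the copy, a simple sanity check is to confirm that the backup directory holds as many entries as the source. The helper below is a hypothetical sketch in plain POSIX shell. It counts directory entries only and does not compare file contents.

```shell
# Print the number of entries (non-hidden files) in directory $1.
count_files() {
    set -- "$1"/*
    # With no match the glob stays literal and does not exist: report 0.
    [ -e "$1" ] || { echo 0; return; }
    echo $#
}

# Example, using the illustrative paths from this article:
#   [ "$(count_files /mnt/index/verify)" = "$(count_files /mnt/index/verify.bkp)" ] \
#       && echo "backup has the same number of files"
```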
6. Copy the new index (repair.tch) to be the current dedup2 index (beeindex.tch), and start the Storage Daemon. At this point we are ready to start the SD with the new index file.
Note
Do not run any backup jobs or vacuum processes yet!
$ cp /mnt/index/repair.tch /mnt/index/beeindex.tch
$ sudo systemctl start bacula-sd
Send the last 1000 lines of the SD tracelog to the Bacula Systems Support Team for review before moving on:
$ tail -n 1000 /opt/bacula/working/bacula-sd.trace
Run a vacuum checkmiss to mark any unused space in the containers. From within bconsole (the leading * is the bconsole prompt, not a shell prompt):
* dedup vacuum checkmiss storage=<dedup2_storage_name>
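The same command can also be sent to bconsole from the shell, since bconsole reads commands from standard input. The sketch below only builds the command text. The function name is hypothetical, and the bconsole path and the storage name in the example are assumptions to adapt to the local installation.

```shell
# Print the bconsole command that runs a vacuum checkmiss on the
# dedup2 storage named in $1.
vacuum_cmd() {
    printf 'dedup vacuum checkmiss storage=%s\n' "$1"
}

# Example (assumed bconsole path and a made-up storage name):
#   vacuum_cmd Dedup2Storage | /opt/bacula/bin/bconsole
```

Run it only after the Bacula Systems Support Team has reviewed the SD tracelog.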
Result
Dedup2 index has been rebuilt, and the Storage Daemon is ready to resume normal operations.