Moving from Dedup 1 to Dedup 2

Starting with Bacula Enterprise Edition 16.0, the dedup2 engine introduces a new engine design and a new configuration format. For more details, see the Global Endpoint Deduplication 2 article.

The existing dedup1 index, containers, and volumes cannot simply be reused by the new dedup2 engine. In other words, you cannot configure the dedup2 engine in your environment and point it at the existing dedup1 index, containers, and volumes.

However, it is possible to migrate data from the current dedup1 engine (now called dedup legacy; the terms are used interchangeably) to the new dedup2 engine. Also, you may keep the current dedup1 engine for restore purposes, whilst using the new dedup2 engine to store new backups.

This document presents a few scenarios to help you adopt the new dedup2 engine, assuming you already use the deduplication plugin with the old dedup engine layout and configuration.

Important

Before implementing any strategy in production, carefully read the Important Considerations section.

Important Considerations

  • Running two dedup engines in parallel can be I/O intensive, both on the SSD where the dedup index is stored and on the storage where the containers are kept, especially if the containers are stored on NFS mounts.

  • If the same NFS mount is used to store the containers of both the dedup1 and the dedup2 engines in a migration process, you may have a high load on the NFS mount.

  • It is recommended to schedule migration jobs when backup jobs are not running.

  • In scenarios where the dedup1 engine is kept for restore purposes, note that the storage occupancy in dedup1 will remain the same until this storage is fully decommissioned. This is because the containers cannot be shrunk, even if data is pruned.

  • In scenarios where the data is not migrated to dedup2, the new dedup2 engine starts from scratch, without a high dedup ratio. The very same data (chunks) can therefore exist in both dedup engines, and as a consequence the overall storage occupancy will be high.
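
The recommendation to run migration jobs outside the backup window can be sketched as a Director configuration fragment. This is only an illustration under stated assumptions: it assumes the migration is performed with a standard Bacula job of Type = Migrate, that backups run overnight, and that the Pool and Storage resources (not shown) already exist. All resource names and times are hypothetical placeholders, not values taken from this document.

```
# Hypothetical sketch: every resource name and time below is a placeholder.
# Assumption: backup jobs run overnight, so migration is scheduled for midday.
Schedule {
  Name = "Dedup2MigrationWindow"
  Run = Full mon-fri at 12:00
}

# Assumption: a standard Migrate job is used to move data out of the dedup1 pool.
Job {
  Name = "MigrateDedup1ToDedup2"
  Type = Migrate
  Level = Full
  Schedule = "Dedup2MigrationWindow"
  Client = "placeholder-fd"          # required by the parser for Migrate jobs
  FileSet = "PlaceholderFileSet"     # required by the parser for Migrate jobs
  Messages = Standard
  Pool = "Dedup1Pool"                # source pool; its Next Pool would name the dedup2 pool
  Selection Type = Volume
  Selection Pattern = ".*"
}
```

Keeping the migration on its own Schedule resource makes it easy to move the window if backup times change, without touching the backup jobs themselves.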

The following scenarios show how to move from dedup1 to dedup2.

Scenarios for migration of data from dedup1 to dedup2:

Scenarios for keeping dedup1 for restore purposes only: