Backup

Bacula Enterprise Only

This solution is only available for Bacula Enterprise. For subscription inquiries, please reach out to sales@baculasystems.com.

The backup workflow is snapshot-driven.

How the Backup Works

  1. The plugin connects to the selected filesystem implementation.

  2. A snapshot is created of the directory configured with base_snap_dir.

  3. Full jobs walk the current snapshot contents directly.

  4. Incremental and differential jobs compare the current snapshot with the previous Bacula snapshot when it exists.

  5. Bacula receives file attributes and streams through the standard plugin protocol.

  6. The snapshot is kept or cleaned up according to the configured retention settings.
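
The steps above can be sketched in a few lines of Python. This is only an illustration of the sequence, not the plugin's implementation: FakeSnapshotFS is a hypothetical in-memory stand-in for a snapshot-capable filesystem, keep_snapshot is a hypothetical stand-in for the retention settings, and the bsnap_ prefix simply mirrors the snapshot names visible in the job log on this page. The diff in the sketch only reports new or changed paths and ignores deletions.

```python
class FakeSnapshotFS:
    """Hypothetical in-memory stand-in for a snapshot-capable filesystem."""

    def __init__(self, files):
        self.files = dict(files)   # live view: path -> content
        self.snapshots = {}        # snapshot name -> frozen copy of the live view

    def create_snapshot(self, name):
        self.snapshots[name] = dict(self.files)

    def delete_snapshot(self, name):
        del self.snapshots[name]


def run_backup(fs, job_name, level, previous=None, keep_snapshot=False):
    """Return the sorted list of paths a job of the given level would send."""
    snap = f"bsnap_{job_name}"
    fs.create_snapshot(snap)                       # step 2: take the snapshot
    current = fs.snapshots[snap]

    if level == "full" or previous not in fs.snapshots:
        # step 3, or fallback when the previous snapshot is missing
        selected = sorted(current)
    else:
        # step 4: compare against the previous snapshot (new or changed paths only)
        old = fs.snapshots[previous]
        selected = sorted(p for p in current if old.get(p) != current[p])

    # step 5 would stream attributes and data for each selected path here

    if not keep_snapshot:                          # step 6: retention decision
        fs.delete_snapshot(snap)
    return selected
```

Keeping the snapshot of one job is what makes the diff of the next job possible, which is why the retention decision is part of the same workflow.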

Backup Behavior

  • Full backup: the current snapshot is enumerated directly.

  • Incremental or differential backup: the plugin uses HDFS snapshot diffs when the previous Bacula snapshot is available.

  • Snapshot fallback: if the previous Bacula snapshot cannot be found, the job degrades to a full snapshot walk.

  • Local mode: when fstype=local is used, the plugin walks the local filesystem and backs it up as ordinary files.

  • Filtered backup: include and exclude filters are applied before the file is sent to Bacula storage.
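
As a rough illustration of the filtered backup behavior, the sketch below applies include and exclude patterns to candidate paths. It assumes shell-style glob semantics (Python's fnmatch, where * also matches /) and assumes that an exclude match overrides an include match; the plugin's real matching rules may differ, so treat is_selected as a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_selected(path, include=(), exclude=()):
    """Return True if path passes the include and exclude glob filters.

    Assumptions (not taken from the plugin): an empty include list selects
    everything, and any exclude match rejects the path.
    """
    if include and not any(fnmatchcase(path, pat) for pat in include):
        return False
    return not any(fnmatchcase(path, pat) for pat in exclude)


# Candidate paths, filtered before anything would be sent to storage.
paths = ["/user/ana/logs/app.log", "/user/ana/data/part-0"]
kept = [p for p in paths if is_selected(p, include=["*/logs/*"])]
```

With the include=*/logs/* pattern used in the example FileSet on this page, only paths that contain a logs directory component would be kept under these assumptions.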

Example Backup Job

JobDefs {
  Name = BackupJob
  Type = Backup
  Pool = Default
  Storage = File
  Messages = Standard
  Priority = 10
  Client = bacula-hdfs-fd
}

Job {
  Name = PluginHdfsTest
  JobDefs = BackupJob
  Fileset = FS_Hdfs
}

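Fileset {
  Name = FS_Hdfs
  Include {
    # user and url are the Hadoop connection parameters, base_snap_dir is the
    # directory the snapshot is taken of, and include keeps only matching paths.
    Plugin = "hdfs: user=hadoop url=hdfs://localhost:9000 base_snap_dir=/hdfs-regress-test include=*/logs/*"
  }
}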
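
The Plugin line is a plugin name followed by space-separated key=value parameters. When reviewing or generating FileSets, it can help to split that string into its parts. The helper below is a minimal sketch under the assumption that parameter values contain no spaces; parse_plugin_line is a hypothetical name, not part of Bacula.

```python
def parse_plugin_line(line):
    """Split 'name: key=value key=value ...' into (name, params dict).

    Assumes values contain no whitespace; quoting is not handled.
    """
    name, sep, rest = line.partition(":")   # first ':' ends the plugin name
    if not sep:
        raise ValueError("missing ':' after plugin name")
    params = {}
    for token in rest.split():
        key, eq, value = token.partition("=")
        if not eq:
            raise ValueError(f"parameter without '=': {token!r}")
        params[key] = value
    return name.strip(), params


name, params = parse_plugin_line(
    "hdfs: user=hadoop url=hdfs://localhost:9000 "
    "base_snap_dir=/hdfs-regress-test include=*/logs/*"
)
```

Because only the first colon is treated as the separator, the colons inside the url value are preserved.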

Example bconsole Session

*run job=JOB_HDFS_HA_MyCluster
Using Catalog "MyCatalog"
Run Backup job
JobName:  JOB_HDFS_HA_MyCluster
Level:    Full
Client:   bacula-hdfs-fd
Fileset:  FS_HDFS_HA_Mycluster
Pool:     Default (From Job resource)
Storage:  File (From Job resource)
OK to run? (yes/mod/no): yes
Job queued. JobId=889
*wait
*list joblog jobid=889
 Automatically selected Catalog: BaculaCatalog
 Using Catalog "BaculaCatalog"
 +----------------------------------------------------------------------------------------------------+
 | logtext                                                                                              |
 +----------------------------------------------------------------------------------------------------+
 | bacula-director-dir JobId 889: Start Backup JobId 889, Job=JOB_HDFS_HA_MyCluster.2026-06-10_10.43.39_58 |
 | bacula-director-dir JobId 889: Connected to Storage "DiskAutochanger" at bacula-director:9103 with TLS |
 | bacula-director-dir JobId 889: Using Device "DiskAutochanger_Dev1" to write.                  |
 | bacula-director-dir JobId 889: Connected to Client "bacula-hdfs-fd" at bacula-hdfs:9102 with TLS |
 | bacula-hdfs-fd JobId 889: Connected to Storage at bacula-director:9103 with TLS     |
 | bacula-director-sd JobId 889: Volume "Vol-0025" previously written, moving to end of data.    |
 | bacula-director-sd JobId 889: Ready to append to end of Volume "Vol-0025" size=6,888          |
 | bacula-hdfs-fd JobId 889: hdfs: Jar Version: 1.0.0 | Java version: 1.8.0_412 | Java Pid: 17585 |
 | bacula-hdfs-fd JobId 889: hdfs: Maximum java memory configured: 3.56 GiB                   |
 | bacula-hdfs-fd JobId 889: hdfs: Loaded core-site.xml from: /opt/bacula/etc/hdfs/core-site.xml |
 | bacula-hdfs-fd JobId 889: hdfs: Loaded hdfs-site.xml from: /opt/bacula/etc/hdfs/hdfs-site.xml |
 | bacula-hdfs-fd JobId 889: hdfs: Connected to hdfs://mycluster                              |
 | bacula-hdfs-fd JobId 889: hdfs: Creating snapshot for: Path=/user/ana Name=bsnap_JOB_HDFS_HA_MyCluster.2026-06-10_10.43.39_58 |
 | bacula-hdfs-fd JobId 889: hdfs: Snasphot created: Path=/user/ana Name=bsnap_JOB_HDFS_HA_MyCluster.2026-06-10_10.43.39_58 |
 | bacula-hdfs-fd JobId 889: hdfs: Deleted Snapshot of:/user/ana with name:bsnap_JOB_HDFS_HA_MyCluster.2026-06-10_10.43.39_58 |
 | bacula-hdfs-fd JobId 889: hdfs: Disconnected from Hadoop instance                          |
 | bacula-director-sd JobId 889: Elapsed time=00:00:26, Transfer rate=26600  Bytes/second          |
 | bacula-director-sd JobId 889: Sending spooled attrs to the Director. Despooling 1,326 bytes ... |
 | bacula-director-dir JobId 889: Bacula Enterprise bacula-director-dir 18.2.4 (08Jun26):
   Build OS:               x86_64-redhat-linux-gnu-bacula-enterprise redhat (Blue
   JobId:                  889
   Job:                    JOB_HDFS_HA_MyCluster.2026-06-10_10.43.39_58
   Backup Level:           Full
   Client:                 "bacula-hdfs-fd" 18.2.4 (10Mar26) x86_64-redhat-linux-gnu-bacula-enterprise,redhat,(Core)
   FileSet:                "FS_HDFS_HA_Mycluster" 2026-06-10 10:42:04
   Pool:                   "DiskBackup365d" (From Job resource)
   Catalog:                "BaculaCatalog" (From Client resource)
   Storage:                "DiskAutochanger" (From Job resource)
   Scheduled time:         10-Jun-2026 10:43:39
   Start time:             10-Jun-2026 10:43:42
   End time:               10-Jun-2026 10:43:48
   Elapsed time:           26 secs
   Priority:               10
   FD Files Written:       22456
   SD Files Written:       22456
   FD Bytes Written:       11 (11 B)
   SD Bytes Written:       1,597 (1.597 MB)
   Rate:                   0.0 KB/s
   Software Compression:   None
   Comm Line Compression:  33.8% 1.5:1
   Snapshot/VSS:           no
   Encryption:             no
   Accurate:               no
   Volume name(s):         Vol-0025
   Volume Session Id:      16
   Volume Session Time:    1780952502
   Last Volume Bytes:      9,083 (9.083 KB)
   Non-fatal FD errors:    0
   SD Errors:              0
   FD termination status:  OK
   SD termination status:  OK
   Termination:            Backup OK |
 | bacula-director-dir JobId 889: Begin pruning Jobs older than 6 months .                       |
 | bacula-director-dir JobId 889: No Jobs found to prune.                                        |
 | bacula-director-dir JobId 889: Begin pruning Files.                                           |
 | bacula-director-dir JobId 889: No Files found to prune.                                       |
 | bacula-director-dir JobId 889: End auto prune.                                                |
 +----------------------------------------------------------------------------------------------------+
 +-------+-----------------------+---------------------+------+-------+----------+----------+-----------+
 | jobid | name                  | starttime           | type | level | jobfiles | jobbytes | jobstatus |
 +-------+-----------------------+---------------------+------+-------+----------+----------+-----------+
 |   889 | JOB_HDFS_HA_MyCluster | 2026-06-10 10:43:42 | B    | F     |    22456 |  1635328 | T         |
 +-------+-----------------------+---------------------+------+-------+----------+----------+-----------+

Operational Guidance

Backups are usually scheduled as one job per snapshot directory, so that each job maps to a single base_snap_dir and snapshot retention and troubleshooting remain easy to reason about. When the namespace is large, prefer tighter include or exclude filters to reduce the amount of data traversed inside each snapshot.
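
One way to keep a one-job-per-directory layout consistent is to generate the FileSet resources from a list of directories. The sketch below is only an illustration: the FS_Hdfs_ naming scheme and the fileset_for helper are hypothetical, and the parameters are limited to the ones shown in the example FileSet on this page.

```python
import re


def fileset_for(directory, user, url):
    """Render one FileSet resource for a single snapshot directory."""
    # Hypothetical naming scheme: non-alphanumeric runs in the path become "_".
    slug = re.sub(r"[^A-Za-z0-9]+", "_", directory).strip("_")
    plugin = f"hdfs: user={user} url={url} base_snap_dir={directory}"
    return (
        "Fileset {\n"
        f"  Name = FS_Hdfs_{slug}\n"
        "  Include {\n"
        f'    Plugin = "{plugin}"\n'
        "  }\n"
        "}\n"
    )


# One FileSet per snapshot directory; the directories are example values.
directories = ["/data/hadoop", "/user/ana"]
config = "\n".join(
    fileset_for(d, "hadoop", "hdfs://localhost:9000") for d in directories
)
```

Each generated FileSet would then be referenced by its own Job resource, so a failed snapshot or a retention problem points at exactly one directory.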

Kerberos Example

A Kerberos-enabled cluster usually combines the Hadoop connection parameters, local XML files copied onto the File Daemon host, and the Kerberos credentials required by the target services.

Fileset {
  Name = FS_Hdfs_Kerberos
  Include {
    Plugin = "hdfs: user=hadoop url=hdfs://nn01.example.com:8020 core_site_path=/etc/bacula/hdfs/core-site.xml hdfs_site_path=/etc/bacula/hdfs/hdfs-site.xml user_principal=hadoop@EXAMPLE.COM keytab=/etc/bacula/hdfs/hadoop.keytab service_principal=hdfs/_HOST@EXAMPLE.COM base_snap_dir=/data/hadoop"
  }
}

This example keeps the connection parameters close to what is typically used in production Hadoop clusters, while still making the authentication inputs explicit.
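
Since the XML files and the keytab must exist on the File Daemon host, a small preflight check can catch a missing file before the job runs. The sketch below is an assumption-laden helper, not a Bacula feature: missing_local_files is a hypothetical name, and it only checks the three path parameters that appear in the example above.

```python
import os

# Parameters from the Kerberos example that point at local files.
LOCAL_FILE_KEYS = ("core_site_path", "hdfs_site_path", "keytab")


def missing_local_files(params):
    """Return (key, path) pairs whose path is set but is not a regular file."""
    return [
        (key, params[key])
        for key in LOCAL_FILE_KEYS
        if key in params and not os.path.isfile(params[key])
    ]


example = {
    "core_site_path": "/etc/bacula/hdfs/core-site.xml",
    "hdfs_site_path": "/etc/bacula/hdfs/hdfs-site.xml",
    "keytab": "/etc/bacula/hdfs/hadoop.keytab",
}
problems = missing_local_files(example)  # empty list when all three files exist
```

The check has to run on the File Daemon host itself, ideally as the user the File Daemon runs as, because file existence on the Director says nothing about what the plugin can read.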
