Architecture

Bacula Enterprise Only

This solution is only available for Bacula Enterprise. For subscription inquiries, please reach out to sales@baculasystems.com.

The HDFS Plugin is a Bacula File Daemon plugin that delegates filesystem access to a Java backend. In HDFS mode, the File Daemon acts as an HDFS client through the Hadoop Java SDK and uses the connection and authentication settings provided by the plugin parameters to reach the target cluster.

The backend uses the Hadoop Java APIs in HDFS mode and the Java filesystem APIs in local mode. The exact target reached by the client is determined by the HDFS connection and authentication parameters, such as url, user, config_file, core_site_path, hdfs_site_path, and ssl_client_path. See Configuration for the full parameter reference.

A backup job is processed through a snapshot manager, a file fetcher, a file opener, and a publisher that sends data back to Bacula. Restore jobs read Bacula packets and either rebuild files in HDFS or write them to a local destination directory.
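The backup flow described above can be pictured as a small pipeline. The following is only a conceptual sketch in Python, not the plugin's Java implementation: every function name (snapshot_root, fetch_files, open_file, publish, run_backup) and the CHUNK_SIZE value are hypothetical, the local filesystem stands in for HDFS, and a plain list stands in for the channel that carries data back to Bacula.

```python
import os
from typing import BinaryIO, Iterator, List, Tuple

# Illustrative chunk size only; this is NOT a plugin setting.
CHUNK_SIZE = 64 * 1024

Packet = Tuple[str, bytes]


def snapshot_root(source_dir: str) -> str:
    """Stand-in for the snapshot manager.

    A real snapshot manager would create or select an HDFS snapshot;
    this sketch simply returns the directory unchanged.
    """
    return source_dir


def fetch_files(root: str) -> Iterator[str]:
    """Stand-in for the file fetcher: yield file paths in a stable order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def open_file(path: str) -> BinaryIO:
    """Stand-in for the file opener: return a readable binary stream."""
    return open(path, "rb")


def publish(name: str, stream: BinaryIO, sink: List[Packet]) -> None:
    """Stand-in for the publisher: append (name, chunk) packets to the sink."""
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        sink.append((name, chunk))


def run_backup(source_dir: str) -> List[Packet]:
    """Chain the four stages and return the packets that were published."""
    sink: List[Packet] = []
    root = snapshot_root(source_dir)
    for path in fetch_files(root):
        with open_file(path) as stream:
            publish(os.path.relpath(path, root), stream, sink)
    return sink
```

Calling run_backup on a directory returns a list of (relative path, data) pairs in traversal order. A restore would consume such packets in the opposite direction and write them either to HDFS or to a local directory.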

The plugin also exposes list and query operations through the same backend, so snapshot discovery and backup use the same connection and authentication path.

Connectivity from the File Daemon usually requires access to the following HDFS endpoints, depending on how the cluster is configured:

  • NameNode RPC port, commonly 8020 or 9000.

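  • NameNode HTTP or HTTPS port, commonly 9870 (Hadoop 3.x default) or 50070 (Hadoop 2.x default) for HTTP, and 9871 (Hadoop 3.x) or 50470 (Hadoop 2.x) for HTTPS.

  • DataNode ports, commonly 9866 (Hadoop 3.x) or 50010 (Hadoop 2.x) for block transfer, and 9864 (Hadoop 3.x) or 50075 (Hadoop 2.x) for the DataNode web endpoint.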

  • Kerberos ports, commonly 88 for the KDC and 749 for the admin server (kadmin), when Kerberos authentication is enabled.
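Before running a job, it can help to confirm that the File Daemon host can reach the endpoints listed above. The following is a minimal sketch using only the Python standard library. The function name check_endpoints is hypothetical and is not part of the plugin. The check covers TCP only, so it will not detect a Kerberos KDC that is reachable only over UDP.

```python
import socket
from typing import Dict, Iterable, Tuple

Endpoint = Tuple[str, int]


def check_endpoints(endpoints: Iterable[Endpoint],
                    timeout: float = 3.0) -> Dict[Endpoint, bool]:
    """Return a mapping of (host, port) to True if a TCP connection succeeds.

    Name resolution failures, refused connections, and timeouts all raise
    OSError subclasses and are reported as False.
    """
    results: Dict[Endpoint, bool] = {}
    for host, port in endpoints:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            results[(host, port)] = False
    return results
```

For example, check_endpoints([("namenode.example.com", 8020), ("datanode1.example.com", 9866)]) would report whether the NameNode RPC port and one DataNode transfer port accept connections. The host names here are placeholders, and the ports must match the actual cluster configuration.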

The XML configuration files used by the plugin can be copied from the HDFS cluster to a path on the File Daemon host and then referenced locally through the connection and authentication parameters listed above, such as config_file, core_site_path, hdfs_site_path, and ssl_client_path.
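After copying the files, a quick sanity check is to confirm that they parse and point at the expected cluster. The sketch below reads the standard Hadoop configuration layout (a configuration root element containing property entries with name and value children) using the Python standard library. The function name parse_hadoop_properties and the sample host name are illustrative assumptions. fs.defaultFS and hadoop.security.authentication are standard Hadoop property names.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def parse_hadoop_properties(xml_text: str) -> Dict[str, str]:
    """Parse Hadoop-style configuration XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical core-site.xml content; the host name is a placeholder.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>"""

props = parse_hadoop_properties(SAMPLE_CORE_SITE)
```

To check a real file, its contents can be read from the local path on the File Daemon host and passed to the same function. If fs.defaultFS is missing or points at a different host than intended, the copied file is probably not the one the cluster actually uses.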

The figure below gives a simplified view of the plugin's architecture within a generic Bacula Enterprise deployment. It highlights the File Daemon-side HDFS client, the Java backend and SDK layer, the HDFS cluster endpoints, and the optional local restore target used when the restore where option points to a filesystem path.

HDFS Plugin Architecture
