Director and Catalog Configuration

Disk Considerations

Catalog Size

For 1 billion objects, the Catalog will use around 300 GB to 400 GB of space. The number of objects depends on your retention period requirements: after the retention period you specify, Bacula prunes file records, reducing the number of database objects and the database space used. For common operations, the database also needs some free space (between 10 and 20 GB) in the temporary disk area. For best performance, we advise keeping filesystem usage under 80%.
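The figure above works out to roughly 300 to 400 bytes per file record, which allows a back-of-the-envelope sizing check. A sketch of the arithmetic (the file count below is an example value, not a recommendation):

```shell
# Rough catalog sizing from the ~300-400 GB per billion objects figure
# above, i.e. roughly 300-400 bytes per file record. FILES is an
# example value; substitute your expected number of retained records.
FILES=250000000
LOW=$(( FILES * 300 / 1000000000 ))    # GB, low estimate
HIGH=$(( FILES * 400 / 1000000000 ))   # GB, high estimate
echo "Estimated catalog size: ${LOW}-${HIGH} GB"
```

Remember to add the 10 to 20 GB of temporary space and the 20% filesystem headroom on top of this estimate.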

RAID Configuration

Bacula Systems advises using a RAID configuration that provides write cache capability, which will usually require a hardware RAID HBA protected with a battery. Typically RAID 10 will provide the best performance and redundancy. Today, using SSDs to build the database storage system is a reasonable approach to achieve the best performance.

Whenever a hardware RAID controller is used with write caching enabled (which is advised for best performance), a battery-backed cache is essential. Bacula Systems recommends using a dedicated HBA known for good performance for the database disk system.

Disk Layout

As with all database systems, you can optimize disk throughput by spreading the database over different physical devices. In general, you can gain performance by placing the Write Ahead Log (pg_wal) and the data files on different physical disks: writes to the WAL are sequential, and it is generally only read during recovery or for replication, whereas I/O on the data files (heap and index) is often random, so it makes sense to optimize the data storage for fast random writes.
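A common way to put pg_wal on its own device is a symlink from the data directory. The sketch below uses scratch paths under /tmp so it is safe to run; in production, substitute your real data directory and WAL volume, and stop PostgreSQL before moving the directory:

```shell
# Relocating pg_wal onto a dedicated device via a symlink (standard
# practice). The paths here are stand-ins under /tmp; in production,
# use your real PGDATA and WAL volume, with PostgreSQL stopped.
PGDATA=/tmp/demo-pgdata          # stands in for e.g. /var/lib/pgsql/data
WALDISK=/tmp/demo-wal-volume     # stands in for the dedicated WAL device
rm -rf "$PGDATA" "$WALDISK"      # clean scratch area for the demo
mkdir -p "$PGDATA/pg_wal" "$WALDISK"

mv "$PGDATA/pg_wal" "$WALDISK/pg_wal"       # move WAL onto the fast device
ln -s "$WALDISK/pg_wal" "$PGDATA/pg_wal"    # link it back into the data dir
ls -ld "$PGDATA/pg_wal"                     # now a symlink to the WAL volume
```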

Filesystems

A modern filesystem is recommended for all PostgreSQL data; XFS, ext4, and ZFS are the most commonly used. Old filesystems such as ext2/ext3 should be avoided.

Using the deadline or noop disk scheduler can also improve throughput significantly. Note that recent systems using the multi-queue block layer offer mq-deadline instead:

# echo deadline > /sys/block/sda/queue/scheduler

To use it for all disks on boot, add the elevator=deadline option to the kernel command line of the boot loader (for legacy GRUB this is usually /boot/grub/grub.conf; for GRUB 2, set GRUB_CMDLINE_LINUX in /etc/default/grub and regenerate the configuration). Note that kernels using the multi-queue block layer ignore the elevator= parameter; on those systems, set the scheduler per device via sysfs or a udev rule.
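On multi-queue kernels, a udev rule is a persistent way to select the scheduler per device. A sketch (the file name is arbitrary, and the device glob should be adjusted to match your disks):

```
# /etc/udev/rules.d/60-io-scheduler.rules (example file name)
# Select mq-deadline for all sd* devices at boot.
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="mq-deadline"
```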

PostgreSQL issues fsync system calls to flush data to disk. With a properly set up battery-backed RAID controller providing a non-volatile cache, write barriers can be turned off. Mounting the volume with the nobarrier option disables barriers for that volume.
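As an illustration, an /etc/fstab entry for an ext4 database volume with barriers disabled might look as follows (the device name and mount point are assumptions; only disable barriers when the write cache is genuinely non-volatile):

```
# /etc/fstab (excerpt) - database volume with write barriers disabled
/dev/sdb1   /var/lib/pgsql   ext4   defaults,noatime,nobarrier   0 2
```

Note that newer kernels have removed the nobarrier option for some filesystems (XFS since 4.19), so check your kernel and filesystem documentation first.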

Testing Hardware

For optimal performance, you will need to iterate configuration changes and tests on your server a few times before using it in production. Tests with tools such as bonnie++ or iozone can be very useful.
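Before a full bonnie++ or iozone run, a quick sequential-write sanity check is possible with dd (the target path is an example; conv=fdatasync makes dd flush to disk before reporting throughput, so the page cache does not inflate the result):

```shell
# Quick sequential-write check of a volume before deeper bonnie++ /
# iozone testing. /tmp/ddtest is an example target path; point it at
# the filesystem you intend to benchmark.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=64 conv=fdatasync
rm -f /tmp/ddtest
```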

Memory Considerations

Depending on your configuration, your Catalog server may need considerable amounts of memory. The PostgreSQL logs provide information on the memory used by Bacula, making it fairly easy to tell when more memory is needed.

The amount of memory needed depends a lot on your production needs. If you are planning to handle multi-million file jobs within a very small backup window, the Catalog server will need a fast storage system and lots of memory.
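As an illustration, starting-point memory settings in postgresql.conf for a dedicated catalog server with 64 GB of RAM might look like this (the values are common rules of thumb, not tuned recommendations; adjust them to your workload):

```
# postgresql.conf (excerpt) - example for a dedicated 64 GB server
shared_buffers = 16GB            # ~25% of RAM is a common starting point
work_mem = 64MB                  # per sort/hash operation, per backend
maintenance_work_mem = 1GB       # helps index builds and vacuum
effective_cache_size = 48GB      # ~75% of RAM; planner hint, not an allocation
```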

On servers with large memory, you need to adjust the kernel write cache to ensure smooth flushing to the disk.

# cat /etc/sysctl.conf
vm.dirty_ratio = 2
vm.dirty_background_ratio = 1

vm.swappiness = 1
vm.zone_reclaim_mode = 0

# sysctl -p

If the server has a large amount of memory, we advise disabling kernel memory overcommit. Note that with vm.overcommit_memory = 2, the commit limit is the swap size plus vm.overcommit_ratio percent of RAM (50% by default), so review that setting as well.

# tail /etc/sysctl.conf

vm.overcommit_memory = 2

# sysctl -p

Server Monitoring

We recommend that you install sysstat tools to have CPU, disk, and network usage information available if fine-tuning becomes necessary.

Additional monitoring of CPU and memory usage, disk space availability, and disk system availability (like I/O wait times or queue lengths) can also be desirable. A network monitoring system like Nagios or Zabbix can help with monitoring, logging, and alerting.
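Once sysstat is installed, sar, iostat, and pidstat record these metrics over time. Even without it, load and memory can be spot-checked directly from /proc on any Linux system:

```shell
# Spot-check load average and memory from /proc; sysstat's sar and
# iostat record the same counters historically for later analysis.
echo "load average: $(cut -d' ' -f1-3 /proc/loadavg)"
grep -E '^(MemTotal|MemAvailable):' /proc/meminfo
```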

CPU Considerations

The number of CPUs depends on the number of jobs that you want to run concurrently; you can expect to run 5 to 10 jobs in parallel per CPU core smoothly.
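The guideline above can be turned into a rough sizing formula. A sketch (the job count is an example value):

```shell
# Core-count estimate from the 5-10 concurrent jobs per core guideline.
JOBS=100                            # example: target concurrent jobs
MIN_CORES=$(( (JOBS + 9) / 10 ))    # optimistic: 10 jobs per core
MAX_CORES=$(( (JOBS + 4) / 5 ))     # conservative: 5 jobs per core
echo "Plan for ${MIN_CORES}-${MAX_CORES} cores for ${JOBS} concurrent jobs"
```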

Server Configuration

To run a large number of concurrent jobs, you need to configure the nofile and nproc ulimit settings for the root, bacula, and postgres Unix accounts.

# cat /etc/security/limits.conf
root soft nofile 10000
root hard nofile 10000

root soft nproc 10000
root hard nproc 10000

bacula soft nofile 10000
bacula hard nofile 10000

bacula soft nproc 10000
bacula hard nproc 10000

postgres soft nofile 10000
postgres hard nofile 10000

postgres soft nproc 10000
postgres hard nproc 10000

Running 200 jobs in parallel consumes on average 1024 open files and 500 threads. As a rule of thumb, multiply the number of parallel jobs by 50 to obtain the open files limit. On some Linux distributions, services do not read the /etc/security/limits.conf configuration file; with systemd, for example, the limits are set in the service file. Bacula Systems recommends configuring the open files limit on both the Director and Storage Daemon hosts, considering the number of parallel jobs that will be run.
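The rule of thumb and its systemd equivalent can be sketched as follows; the unit name bacula-dir.service is an assumption, so adjust it to your distribution's packaging:

```shell
# ~50 open files per parallel job, per the rule of thumb above.
JOBS=200
NOFILE=$(( JOBS * 50 ))
echo "nofile for ${JOBS} parallel jobs: ${NOFILE}"

# On systemd distributions, services do not read limits.conf; use a
# drop-in override instead (unit name is an assumption):
cat <<EOF
# /etc/systemd/system/bacula-dir.service.d/limits.conf
[Service]
LimitNOFILE=${NOFILE}
LimitNPROC=10000
EOF
# then: systemctl daemon-reload && systemctl restart bacula-dir
```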
