Database Tables

Table 9.1: Path table layout

| Column Name | Data Type | Remark |
|---|---|---|
| PathId | integer | Primary Key |
| Path | blob | Full Path |

The Path table contains the path or directory names of all directories on the system or systems. The filename and any MSDOS disk name are stripped off. As with the filename, only one copy of each directory name is kept regardless of how many machines or drives have the same directory. These path names should be stored in Unix path name format.
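For reference, a minimal MySQL sketch matching the layout in Table 9.1 might look like the following. The column attributes (unsigned, auto-increment) and the prefix index length are illustrative assumptions, not the definitive Bacula schema.

```sql
-- Illustrative sketch only; the real schema is created by Bacula's
-- table-creation scripts and may differ in details.
CREATE TABLE Path (
    PathId INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,  -- Primary Key
    Path   BLOB NOT NULL,                             -- full path in Unix format
    PRIMARY KEY (PathId),
    INDEX (Path(255))                                 -- prefix index; length assumed
);
```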

Some simple testing on a Linux file system indicates that separating the filename and the path may be more complicated than is warranted by the space savings. For example, the test system has a total of 89,097 files, 60,467 of which have unique filenames, and there are 4,374 unique paths.

Finding all those files and doing two stat() calls per file takes an average wall clock time of 1 min 35 seconds on a 400MHz machine running RedHat 6.1 Linux.

Finding all those files and putting them directly into a MySQL database, with the path and filename defined as TEXT (variable length, up to 65,535 characters), takes 19 mins 31 seconds and creates a 27.6 MB database.

Doing the same thing, but inserting them into Blob fields with the filename indexed on the first 30 characters and the path name indexed on the first 255 (maximum) characters, takes 5 mins 18 seconds and creates a 5.24 MB database. Rerunning the job (with the database already created) takes about 2 mins 50 seconds.

Running the same as the last one (Path and Filename as Blob), but with the Filename indexed on the first 30 characters and the Path on the first 50 characters (a linear search is done thereafter), takes 5 mins on average and creates a 3.4 MB database. Rerunning with the data already in the DB takes 3 mins 35 seconds.

Finally, saving only the full path name rather than splitting the path and the file, and indexing it on the first 50 characters takes 6 mins 43 seconds and creates a 7.35 MB database.
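A hedged sketch of the prefix-indexed variant used in those timings is shown below; the table and index definitions are illustrative assumptions for the experiment described above, not the schema Bacula actually creates.

```sql
-- Illustrative only: BLOB columns indexed on a short prefix, as in the
-- 30/50-character test described above.
CREATE TABLE TestFile (
    FileId   INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
    Filename BLOB NOT NULL,
    Path     BLOB NOT NULL,
    PRIMARY KEY (FileId),
    INDEX (Filename(30)),   -- index first 30 characters of the filename
    INDEX (Path(50))        -- index first 50 characters of the path
);
```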

Table 9.2: File table layout

| Column Name | Data Type | Remark |
|---|---|---|
| FileId | integer | Primary Key |
| FileIndex | integer | The sequential file number in the Job |
| JobId | integer | Link to Job Record |
| PathId | integer | Link to Path Record |
| Filename | blob | Filename Record |
| MarkId | integer | Used to mark files during Verify Jobs |
| LStat | tinyblob | File attributes in base64 encoding |
| MD5 | tinyblob | MD5/SHA1 signature in base64 encoding |
| DeltaSeq | integer | Delta Sequence number |

The File table contains one entry for each file backed up by Bacula. Thus a file that is backed up multiple times (as is normal) will have multiple entries in the File table. This will probably be the table with the most records. Consequently, it is essential to keep the size of this record to an absolute minimum. At the same time, this table must contain all the information (or pointers to the information) about the file and where it is backed up. Since a file may be backed up many times without having changed, the path is stored in a separate table.

This table contains by far the largest amount of information in the Catalog database, both in the number of records and in total database size. As a consequence, the user must take care to periodically reduce the number of File records using the retention command in the Console program.
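As an illustration of how the split storage is reassembled, a query along the following lines (a hedged sketch; the JobId value is a placeholder) lists the full names of the files saved by one Job:

```sql
-- Reassemble full file names for one Job by joining File and Path.
SELECT CONCAT(Path.Path, File.Filename) AS FullName,
       File.FileIndex, File.LStat
FROM File
JOIN Path ON Path.PathId = File.PathId
WHERE File.JobId = 1234          -- placeholder JobId
ORDER BY File.FileIndex;
```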

Table 9.3: Job table layout

| Column Name | Data Type | Remark |
|---|---|---|
| JobId | integer | Primary Key |
| Job | tinyblob | Unique Job Name |
| Name | tinyblob | Job Name |
| PurgedFiles | tinyint | Set when all File records are purged; used for purging/retention periods |
| Type | binary(1) | Job Type: Backup, Copy, Clone, Archive, Migration |
| Level | binary(1) | Job Level |
| ClientId | integer | Client index |
| JobStatus | binary(1) | Job Termination Status |
| SchedTime | datetime | Time/date when Job scheduled |
| StartTime | datetime | Time/date when Job started |
| EndTime | datetime | Time/date when Job ended |
| RealEndTime | datetime | Time/date when original Job ended |
| JobTDate | bigint | Start day in Unix format but 64 bits; used for Retention period |
| VolSessionId | integer | Unique Volume Session ID |
| VolSessionTime | integer | Unique Volume Session Time |
| JobFiles | integer | Number of files saved in Job |
| JobBytes | bigint | Number of bytes saved in Job |
| ReadBytes | bigint | Number of bytes read in Job |
| JobErrors | integer | Number of errors during Job |
| JobMissingFiles | integer | Number of files not saved (not yet used) |
| PoolId | integer | Link to Pool Record |
| FileSetId | integer | Link to FileSet Record |
| PriorJobId | integer | Link to prior Job Record when migrated |
| HasBase | tinyint | Set when Base Job run |
| Reviewed | tinyint | Set when the error is acknowledged |
| Comment | tinyblob | Comment about this Job |
| PriorJob | tinyblob | Prior Job name when migrated |

The Job table contains one record for each Job run by Bacula. Thus there will normally be one record per day per machine added to the database. Note, the JobId is used to index Job records in the database, and it is often shown to the user in the Console program. However, care must be taken with its use as it is not unique from database to database. For example, the user may have a database for Client data saved on machine Rufus and another database for Client data saved on machine Roxie. In this case, the two databases will each have JobIds that match those in the other database. For a unique reference to a Job, see Job below.

The Name field of the Job record corresponds to the Name resource record given in the Director’s configuration file. Thus it is a generic name, and it will be normal to find many Jobs (or even all Jobs) with the same Name.

The Job field contains a combination of the Name and the time at which the Job was scheduled by the Director. Thus for a given Director, even with multiple Catalog databases, the Job field contains a unique name that represents the Job.

For a given Storage daemon, the VolSessionId and VolSessionTime form a unique identification of the Job. This will be the case even if multiple Directors are using the same Storage daemon.

The Job Type (or simply Type) can have one of the following values:

Table 9.4: Job Types

| Value | Meaning |
|---|---|
| B | Backup Job |
| M | Migrated Job |
| V | Verify Job |
| R | Restore Job |
| U | Console program (not in database) |
| I | Internal or system Job |
| D | Admin Job |
| A | Archive Job (not implemented) |
| C | Copy of a Job |
| c | Copy Job |
| g | Migration Job |

Note: the Job Type values shown in Table 9.4 are not kept in an SQL table.

The JobStatus field specifies how the job terminated, and can be one of the following:

Table 9.5: Job Statuses

| Value | Meaning |
|---|---|
| C | Created but not yet running |
| R | Running |
| B | Blocked |
| T | Terminated normally |
| W | Terminated normally with warnings |
| E | Terminated in Error |
| e | Non-fatal error |
| f | Fatal error |
| D | Verify Differences |
| A | Canceled by the user |
| I | Incomplete Job |
| F | Waiting on the File daemon |
| S | Waiting on the Storage daemon |
| m | Waiting for a new Volume to be mounted |
| M | Waiting for a Mount |
| s | Waiting for Storage resource |
| j | Waiting for Job resource |
| c | Waiting for Client resource |
| d | Waiting for Maximum jobs |
| t | Waiting for Start Time |
| p | Waiting for higher priority job to finish |
| i | Doing batch insert of file records |
| a | SD despooling attributes |
| l | Doing data despooling |
| L | Committing data (last despool) |
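As a hedged example of how these status codes are typically queried (the column list and WHERE clause are illustrative, not a Bacula-supplied query), the following lists recent Jobs that terminated in error or were cancelled:

```sql
-- Find recently failed or cancelled Jobs using the status codes in Table 9.5.
SELECT JobId, Name, StartTime, EndTime, JobStatus
FROM Job
WHERE JobStatus IN ('E', 'f', 'A')   -- error, fatal error, cancelled
ORDER BY StartTime DESC
LIMIT 20;
```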

Table 9.6: FileSet table layout

| Column Name | Data Type | Remark |
|---|---|---|
| FileSetId | integer | Primary Key |
| FileSet | tinyblob | FileSet name |
| MD5 | tinyblob | MD5 checksum of FileSet |
| CreateTime | datetime | Time and date FileSet created |

The FileSet table contains one entry for each FileSet that is used. The MD5 signature is kept to ensure that if the user changes anything inside the FileSet, the change will be detected and the new FileSet will be used. This is particularly important when doing an Incremental backup: if the user deletes or adds a file, we need to ensure that a Full backup is done prior to the next Incremental.
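A hedged illustration of how the MD5 column distinguishes revisions of the same FileSet (the FileSet name is a placeholder):

```sql
-- Each change to a FileSet definition produces a new row with a new MD5,
-- so all revisions of one FileSet name can be listed like this.
SELECT FileSetId, FileSet, MD5, CreateTime
FROM FileSet
WHERE FileSet = 'Full Set'       -- placeholder FileSet name
ORDER BY CreateTime;
```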

Table 9.7: JobMedia table layout

| Column Name | Data Type | Remark |
|---|---|---|
| JobMediaId | integer | Primary Key |
| JobId | integer | Link to Job Record |
| MediaId | integer | Link to Media Record |
| FirstIndex | integer | The index (sequence number) of the first file written for this Job to the Media |
| LastIndex | integer | The index of the last file written for this Job to the Media |
| StartFile | integer | The physical media (tape) file number of the first block written for this Job |
| EndFile | integer | The physical media (tape) file number of the last block written for this Job |
| StartBlock | integer | The number of the first block written for this Job |
| EndBlock | integer | The number of the last block written for this Job |
| VolIndex | integer | The Volume use sequence number within the Job |

The JobMedia table contains one entry at each of the following points: the start of the Job, the start of each new tape file, the start of each new tape, and the end of the Job. Since, by default, a new tape file is written every 2GB, you will in general have more than two JobMedia records per Job. The number can be varied by changing the "Maximum File Size" specified in the Device resource. This record allows Bacula to efficiently position close to (within 2GB of) any given file in a backup. For restoring a full Job, these records are not very important, but if you want to retrieve a single file that was written near the end of a 100GB backup, the JobMedia records can speed it up by orders of magnitude by permitting forward spacing over files and blocks rather than reading the whole 100GB backup.
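A hedged sketch of the kind of lookup this enables (the JobId and FileIndex values are placeholders): find the tape file and block at which to start positioning for a particular file of a Job.

```sql
-- Locate the media segment containing FileIndex 42 of Job 1234 so the
-- drive can forward-space to StartFile/StartBlock instead of reading
-- the whole backup.
SELECT MediaId, StartFile, StartBlock, EndFile, EndBlock
FROM JobMedia
WHERE JobId = 1234               -- placeholder JobId
  AND FirstIndex <= 42           -- placeholder FileIndex
  AND LastIndex  >= 42
ORDER BY VolIndex;
```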

Table 9.8: Media table layout

| Column Name | Data Type | Remark |
|---|---|---|
| MediaId | integer | Primary Key |
| VolumeName | tinyblob | Volume name |
| Slot | integer | Autochanger Slot number or zero |
| PoolId | integer | Link to Pool Record |
| MediaType | tinyblob | The MediaType supplied by the user |
| MediaTypeId | integer | The MediaTypeId |
| LabelType | tinyint | The type of label on the Volume |
| FirstWritten | datetime | Time/date when first written |
| LastWritten | datetime | Time/date when last written |
| LabelDate | datetime | Time/date when tape labeled |
| VolJobs | integer | Number of jobs written to this media |
| VolFiles | integer | Number of files written to this media |
| VolBlocks | integer | Number of blocks written to this media |
| VolMounts | integer | Number of times media mounted |
| VolBytes | bigint | Number of bytes written to this media |
| VolParts | integer | The number of parts for a Volume (DVD) |
| VolErrors | integer | Number of errors on this media |
| VolWrites | integer | Number of writes to media |
| MaxVolBytes | bigint | Maximum bytes to put on this media |
| VolCapacityBytes | bigint | Capacity estimate for this volume |
| VolStatus | enum | Status of media: Full, Archive, Append, Recycle, Read-Only, Disabled, Error, Busy |
| Enabled | tinyint | Whether or not Volume can be written |
| Recycle | tinyint | Whether or not Bacula can recycle the Volume: <yes/no> |
| ActionOnPurge | tinyint | What happens to a Volume after purging |
| VolRetention | bigint | 64 bit seconds until expiration |
| VolUseDuration | bigint | 64 bit seconds volume can be used |
| MaxVolJobs | integer | Maximum jobs to put on Volume |
| MaxVolFiles | integer | Maximum EOF marks to put on Volume |
| InChanger | tinyint | Whether or not Volume is in the autochanger |
| StorageId | integer | Storage record ID |
| DeviceId | integer | Device record ID |
| MediaAddressing | integer | Method of addressing media |
| VolReadTime | bigint | Time reading Volume |
| VolWriteTime | bigint | Time writing Volume |
| EndFile | integer | End file number of Volume |
| EndBlock | integer | End block number of Volume |
| LocationId | integer | Location record ID |
| RecycleCount | integer | Number of times recycled |
| InitialWrite | datetime | When Volume first written |
| ScratchPoolId | integer | Id of Scratch Pool |
| RecyclePoolId | integer | Pool ID where to recycle Volume |
| Comment | blob | User text field |

The Volume table (internally referred to as the Media table) contains one entry for each volume, that is, each tape, cassette (8mm, DLT, DAT, …), or file on which information is or was backed up. There is one Volume record created for each of the NumVols specified in the Pool resource record.
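A hedged example query (the Pool name is a placeholder) listing the Volumes in a Pool that are still writable:

```sql
-- List appendable Volumes in one Pool, joining Media to Pool by name.
SELECT Media.VolumeName, Media.VolStatus, Media.VolBytes, Media.LastWritten
FROM Media
JOIN Pool ON Pool.PoolId = Media.PoolId
WHERE Pool.Name = 'Default'        -- placeholder Pool name
  AND Media.VolStatus = 'Append'
  AND Media.Enabled = 1
ORDER BY Media.LastWritten;
```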

Table 9.9: Pool table layout

| Column Name | Data Type | Remark |
|---|---|---|
| PoolId | integer | Primary Key |
| Name | tinyblob | Pool Name |
| NumVols | integer | Number of Volumes in the Pool |
| MaxVols | integer | Maximum Volumes in the Pool |
| UseOnce | tinyint | Use volume once |
| UseCatalog | tinyint | Set to use catalog |
| AcceptAnyVolume | tinyint | Accept any volume from Pool |
| VolRetention | bigint | 64 bit seconds to retain volume |
| VolUseDuration | bigint | 64 bit seconds volume can be used |
| MaxVolJobs | integer | Maximum jobs to put on Volume |
| MaxVolFiles | integer | Maximum EOF marks to put on Volume |
| MaxVolBytes | bigint | Maximum bytes to write on Volume |
| MaxPoolBytes | bigint | Maximum bytes to write on the Pool |
| AutoPrune | tinyint | <yes/no> for autopruning |
| Recycle | tinyint | <yes/no> for allowing auto recycling of Volume |
| ActionOnPurge | tinyint | Default Volume ActionOnPurge |
| PoolType | enum | Backup, Copy, Cloned, Archive, Migration |
| LabelType | tinyint | Type of label: ANSI/Bacula |
| LabelFormat | tinyblob | Label format |
| Enabled | tinyint | Whether or not Volume can be written |
| ScratchPoolId | integer | Id of Scratch Pool |
| RecyclePoolId | integer | Pool ID where to recycle Volume |
| NextPoolId | integer | Pool ID of next Pool |
| MigrationHighBytes | bigint | High water mark for migration |
| MigrationLowBytes | bigint | Low water mark for migration |
| MigrationTime | bigint | Time before migration |

The Pool Table contains one entry for each media pool controlled by Bacula in this database. One media record exists for each of the NumVols contained in the Pool. The PoolType is a Bacula defined keyword. The MediaType is defined by the administrator, and corresponds to the MediaType specified in the Director’s Storage definition record. The CurrentVol is the sequence number of the Media record for the current volume.

Table 9.10: Client table layout

| Column Name | Data Type | Remark |
|---|---|---|
| ClientId | integer | Primary Key |
| Name | tinyblob | File Services Name |
| UName | tinyblob | uname -a from Client (not yet used) |
| AutoPrune | tinyint | <yes/no> for autopruning |
| FileRetention | bigint | 64 bit seconds to retain Files |
| JobRetention | bigint | 64 bit seconds to retain Job |

The Client table contains one entry for each machine backed up by Bacula in this database. Normally the Name is a fully qualified domain name.
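As a hedged illustration of how Client and Job link together (the status code 'T' for a normal termination comes from Table 9.5; the query itself is illustrative, not a Bacula-supplied one), the following lists the most recent successful backup per Client:

```sql
-- Most recent successfully terminated Backup Job for each Client.
SELECT Client.Name, MAX(Job.EndTime) AS LastBackup
FROM Job
JOIN Client ON Client.ClientId = Job.ClientId
WHERE Job.Type = 'B'            -- Backup Jobs only
  AND Job.JobStatus = 'T'       -- terminated normally
GROUP BY Client.Name;
```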

Table 9.11: Storage table layout

| Column Name | Data Type | Remark |
|---|---|---|
| StorageId | integer | Unique Id |
| Name | tinyblob | Resource name of Storage device |
| AutoChanger | tinyint | Set if it is an autochanger |

The Storage Table contains one entry for each Storage used.

Table 9.12: Counter table layout

| Column Name | Data Type | Remark |
|---|---|---|
| Counter | tinyblob | Counter name |
| MinValue | integer | Start/Min value for counter |
| MaxValue | integer | Max value for counter |
| CurrentValue | integer | Current counter value |
| WrapCounter | tinyblob | Name of another counter |

The Counter Table contains one entry for each permanent counter defined by the user.

Table 9.13: Jobhisto table layout

| Column Name | Data Type | Remark |
|---|---|---|
| JobId | integer | Primary Key |
| Job | tinyblob | Unique Job Name |
| Name | tinyblob | Job Name |
| Type | binary(1) | Job Type: Backup, Copy, Clone, Archive, Migration |
| Level | binary(1) | Job Level |
| ClientId | integer | Client index |
| JobStatus | binary(1) | Job Termination Status |
| SchedTime | datetime | Time/date when Job scheduled |
| StartTime | datetime | Time/date when Job started |
| EndTime | datetime | Time/date when Job ended |
| RealEndTime | datetime | Time/date when original Job ended |
| JobTDate | bigint | Start day in Unix format but 64 bits; used for Retention period |
| VolSessionId | integer | Unique Volume Session ID |
| VolSessionTime | integer | Unique Volume Session Time |
| JobFiles | integer | Number of files saved in Job |
| JobBytes | bigint | Number of bytes saved in Job |
| ReadBytes | bigint | Number of bytes read in Job |
| JobErrors | integer | Number of errors during Job |
| JobMissingFiles | integer | Number of files not saved (not yet used) |
| PoolId | integer | Link to Pool Record |
| FileSetId | integer | Link to FileSet Record |
| PriorJobId | integer | Link to prior Job Record when migrated |
| PurgedFiles | tinyint | Set when all File records purged |
| HasBase | tinyint | Set when Base Job run |
| Reviewed | tinyint | Set when the error is acknowledged |
| Comment | tinyblob | Comment about this Job |

The JobHisto table is the same as the Job table, but it keeps long term statistics (i.e. it is not pruned with the Job).

Table 9.14: Log table layout

| Column Name | Data Type | Remark |
|---|---|---|
| LogId | integer | Primary Key |
| JobId | integer | Points to Job record |
| Time | datetime | Time/date log record created |
| LogText | blob | Log text |

The Log table contains a log of all Job output.
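For example, a hedged query (placeholder JobId) that reproduces a Job's log output in order:

```sql
-- Retrieve the complete log output of one Job in chronological order.
SELECT Time, LogText
FROM Log
WHERE JobId = 1234               -- placeholder JobId
ORDER BY Time, LogId;
```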

Table 9.15: Location table layout

| Column Name | Data Type | Remark |
|---|---|---|
| LocationId | integer | Primary Key |
| Location | tinyblob | Text defining location |
| Cost | integer | Relative cost of obtaining Volume |
| Enabled | tinyint | Whether or not Volume is enabled |

The Location table defines where a Volume is physically located.

Table 9.16: Locationlog table layout

| Column Name | Data Type | Remark |
|---|---|---|
| LocLogId | integer | Primary Key |
| Date | datetime | Time/date log record created |
| MediaId | integer | Points to Media record |
| LocationId | integer | Points to Location record |
| NewVolStatus | integer | enum: Full, Archive, Append, Recycle, Purged, Read-Only, Disabled, Error, Busy, Used, Cleaning |
| Enabled | tinyint | Whether or not Volume is enabled |

The Location Log table keeps a record of each change made to a Volume's Location or Volume status.

Table 9.17: Version table layout

| Column Name | Data Type | Remark |
|---|---|---|
| VersionId | integer | Primary Key |

The Version table defines the Bacula database version number. Bacula checks this number before reading the database to ensure that it is compatible with the Bacula binary file.
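A hedged sketch of the kind of check this supports; the expected version number depends on the Bacula release, so no particular value is assumed here:

```sql
-- Read the catalog schema version so it can be compared with the
-- version the Bacula binaries expect.
SELECT VersionId FROM Version;
```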

Table 9.18: Basefiles table layout

| Column Name | Data Type | Remark |
|---|---|---|
| BaseId | integer | Primary Key |
| BaseJobId | integer | JobId of Base Job |
| JobId | integer | Reference to Job |
| FileId | integer | Reference to File |
| FileIndex | integer | File Index number |

The BaseFile table contains all the File references for a particular JobId that point to a Base file, i.e. files that were previously saved and hence were not saved in the current JobId but in BaseJobId under FileId. FileIndex is the index of the file, and is used for optimization of Restore jobs to prevent the need to read the FileId record when creating the in-memory tree. This record is not yet implemented.

