Client Troubleshooting

I’m Getting Authorization Errors. What is Going On?

For security reasons, Bacula requires that both the File daemon and the Storage daemon know the name of the Director as well as its password. As a consequence, if you change the Director’s name or password, you must make the corresponding change in the Storage daemon’s and in the File daemon’s configuration files.

During the authorization process, the Storage daemon and File daemon also require that the Director authenticates itself, so both ends require the other to have the correct name and password.

If you have edited the configuration files and modified any name or any password, and you are getting authentication errors, then your best bet is to go back to the original configuration files generated by the Bacula installation process. Make only the absolutely necessary modifications to these files – e.g. add the correct email address.

Another reason that you can get authentication errors is that you are running multiple concurrent Jobs in the Director, but you have not raised Maximum Concurrent Jobs accordingly in the File daemon or the Storage daemon. Once those daemons reach their limit, they reject further connections, producing authentication (or connection) errors.

If you are having problems connecting to a Windows machine that previously worked, you might try restarting the Bacula service.

Some users report that authentication fails if there is not a proper reverse DNS lookup entry for the machine. This seems to be a requirement of gethostbyname(), which is what Bacula uses to translate names into IP addresses. If you cannot add a reverse DNS entry, or you don’t know how to do so, you can avoid the problem by specifying an IP address rather than a machine name in the appropriate Bacula conf file.
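The forward and reverse lookups described above can be checked from the machine that initiates the connection. The following is a minimal Python sketch, not part of Bacula; the function name check_dns is hypothetical, and it uses only the standard socket module.

```python
import socket
import sys


def check_dns(name):
    """Resolve a name forward, then look up the reverse entry for the result.

    Returns a dict holding the forward address and the reverse name; either
    value is None when the corresponding lookup fails.
    """
    result = {"name": name, "address": None, "reverse_name": None}
    try:
        # Forward lookup: host name -> IPv4 address.
        result["address"] = socket.gethostbyname(name)
    except OSError:
        return result
    try:
        # Reverse lookup: address -> primary host name.
        result["reverse_name"] = socket.gethostbyaddr(result["address"])[0]
    except OSError:
        pass  # no reverse entry, or the resolver could not be reached
    return result


if __name__ == "__main__":
    # Pass the host names from your Bacula configuration as arguments.
    for host in sys.argv[1:]:
        print(check_dns(host))
```

If reverse_name comes back as None for a machine, that machine has no usable reverse entry, and specifying its IP address in the configuration file is the workaround described above.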

Here is a picture that indicates what names/passwords in which files/Resources must match up:

[Figure: Authorization Diagram (Configuration Diagram)]

In the center column, you will find the Director, Storage, and Client resources, with their names and passwords; these are all in bacula-dir.conf. The left column shows where the corresponding values should be found in the Console configuration file, and the right column shows where they should be found in the Storage daemon (SD) and File daemon (FD) configuration files.

Another thing to check is that the Bacula component you are trying to access has Maximum Concurrent Jobs set large enough to handle all of the Jobs and the Console that want to connect simultaneously. Once the maximum number of connections has been reached, a Bacula component rejects all new connections.

Please also remember that File daemons with later versions than the Director and Storage daemons are not supported and can result in authorization errors.

Finally, make sure that no hosts.allow or hosts.deny file is denying access to the machine that is trying to connect.

Bacula Runs Fine but Cannot Access a Client on a Different Machine. Why?

There are several reasons why Bacula might be unable to contact a client on a different machine. They are:

  • It is a Windows Client, and the client died because of an improper configuration file. Check that the Bacula icon is in the system tray and that the menu items work. If the client has died, the icon will disappear only when you move the mouse over the icon.

  • The Client address or port is incorrect or not resolved by DNS. See if you can ping the client machine using the same address as in the Client record.

  • You have a firewall, and it is blocking traffic on port 9102 between the Director’s machine and the Client’s machine (or on port 9103 between the Client and the Storage daemon machines).

  • Your passwords or names do not match between the Director and the Client machine. Try configuring everything exactly as you would to run the client on the same machine as the Director, changing only the Address. If that works, make the other changes one step at a time until it works.

  • You may also be having problems between your File daemon and your Storage daemon. The name you use in the Storage resource of your Director’s conf file must be known (resolvable) by the File daemon, because it is passed symbolically to the File daemon, which then resolves it to get an IP address used to contact the Storage daemon.

  • You may have a hosts.allow or hosts.deny file that is not permitting access.
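Several of the causes above (wrong address or port, a firewall, hosts.allow or hosts.deny rules) can be narrowed down by testing whether a plain TCP connection to the daemon's port succeeds. The following is a minimal Python sketch, assuming the ports 9102 and 9103 mentioned above; the function name port_is_open and the host:port argument format are hypothetical and not part of Bacula.

```python
import socket
import sys


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Each argument is "host:port", for example the Client address with 9102.
    for target in sys.argv[1:]:
        host, port = target.rsplit(":", 1)
        state = "open" if port_is_open(host, int(port)) else "closed or blocked"
        print(f"{host}:{port} {state}")
```

Run it on the Director's machine against the Client address and port 9102, and on the Client's machine against the Storage daemon address and port 9103. A successful connection only shows that the port is reachable; it does not prove that names and passwords match.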

I’m Having VSS Problems on Windows FD. What Should I do?

If you are experiencing problems such as VSS hanging on MSDE, first try running vssadmin to check for problems, then try running ntbackup, which also uses VSS, to see if it has similar problems. If it does, you know that the problem is in your Windows machine and not in Bacula.

The FD hang problems were reported with MSDEwriter when:

  • a local firewall blocked local access to the MSDE TCP port (MSDEwriter seems to use TCP/IP and not Named Pipes).

  • msdtcs was installed to run under “localsystem”: try running msdtcs under networking account (instead of local system) (com+ seems to work better with this configuration).

I’m Having Restore Problems on Windows FD. What Should I do?

If, during a backup, you get the message ERR=Access is denied and you are using the portable option, you should try adding both the non-portable (backup API) and the Volume Shadow Copy options to your Director’s conf file.

In the Options resource:

portable = no

In the FileSet resource:

EnableVSS = yes

In general, specifying these two options should allow you to back up any file on a Windows system. However, in some cases, if users have been allowed full control of their folders, even system programs such as Bacula can be locked out. In this case, you must identify which folders or files are creating the problem and do the following:

  1. Grant ownership of the file/folder to the Administrators group, with the option to replace the owner on all child objects.

  2. Grant full control permissions to the Administrators group, and change the user’s group to only have Modify permission to the file/folder and all child objects.

I don’t know if the FD on Windows opened the port and is listening. What to do?

If you want to see if the File Daemon has properly opened the port and is listening, you can enter the following command in a shell window:

netstat -an | findstr 910[123]

Another program has been recommended for this purpose, but it is not a standard Windows program, so you must find and download it from the Internet.

How to Fix the Windows Boot Record?

An effective way to restore a Windows backup, for those who have not purchased the bare metal restore capability, is to install Windows on a different hard drive and restore the backup. Then boot the recovery CD and run:

diskpart
    select disk 0
    select part 1
    active
    exit
bootrec  /rebuildbcd
bootrec  /fixboot
bootrec  /fixmbr

Why Do I Have Slow Backup on Windows Machines?

Sometimes on Windows machines the File Daemon may have very slow backup transfer rates compared to other machines. To work around this, you might try setting the Maximum Network Buffer Size to 32,768 in both the File daemon and the Storage daemon. The default size is larger, and apparently some Windows ethernet controllers do not deal well with a larger network buffer size.

Many Windows ethernet drivers have a tendency to run slowly, either because of old, broken firmware or because they are running in half-duplex mode. Please check with the ethernet card manufacturer for the latest firmware and use whatever techniques are necessary to ensure that the card is running in full-duplex mode.

I’m Experiencing Problems with Opening the Files (Windows). What to do?

If you are not using the portable option, and you have VSS (Volume Shadow Copy) enabled in the Director, and you experience problems with Bacula not being able to open files, it is most likely that you are running an antivirus program that blocks Bacula from doing certain operations. In this case, disable the antivirus program and try another backup. If it succeeds, either get a different (better) antivirus program or use something like ClientRunBeforeJob/ClientRunAfterJob to turn off the antivirus program while the backup is running.

If turning off anti-virus software does not resolve your VSS problems, you might have to turn on VSS debugging. The following link describes how to do this: support.microsoft.com/kb/887013/en-us.

In Microsoft Windows Small Business Server 2003 the VSS Writer for Exchange is turned off by default. To turn it on, see the following link: support.microsoft.com/default.aspx?scid=kb;EN-US;Q838183.

The most likely source of problems is authentication when the Director attempts to connect to the File Daemon that you installed. This can occur if the names and the passwords defined in the File Daemon’s configuration file bacula-fd.conf on the Windows machine do not match the names and the passwords in the Director’s configuration file bacula-dir.conf located on your Unix/Linux server.

More specifically, the password found in the Client resource in the Director’s configuration file must be the same as the password in the Director resource of the File daemon’s configuration file. In addition, the name of the Director resource in the File daemon’s configuration file must be the same as the name in the Director resource of the Director’s configuration file.

It is a bit hard to explain in words, but if you understand that a Director normally has multiple Clients and a Client (or File Daemon) may permit access by multiple Directors, you can see that the names and the passwords on both sides must match for proper authentication.
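The matching rule above can be checked mechanically. The following is a minimal Python sketch, not a Bacula tool: parse_resources and check_fd_auth are hypothetical helpers, and the parser assumes one directive per line, no '#' characters inside values, and that case and spaces in directive names are not significant. Real configuration files that use include files or other syntax will need more care.

```python
import re


def parse_resources(text):
    """Return [(resource_type, {directive: value})] for top-level resources.

    Simplified: one 'key = value' per line, '#' starts a comment, and
    nested blocks inside a resource are skipped.
    """
    resources, depth, current = [], 0, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith("{"):
            depth += 1
            if depth == 1:
                current = (line[:-1].strip().lower(), {})
            continue
        if line == "}":
            if depth == 1 and current is not None:
                resources.append(current)
                current = None
            depth = max(depth - 1, 0)
            continue
        if depth == 1 and "=" in line:
            key, value = line.split("=", 1)
            # Normalize the directive name: drop spaces, ignore case.
            key = re.sub(r"\s+", "", key).lower()
            current[1][key] = value.strip().strip('"')
    return resources


def check_fd_auth(dir_conf, fd_conf, client_name):
    """Return a list of problems for one Client; an empty list means a match."""
    dir_res = parse_resources(dir_conf)
    fd_res = parse_resources(fd_conf)
    director = next((r for t, r in dir_res if t == "director"), None)
    client = next((r for t, r in dir_res
                   if t == "client" and r.get("name") == client_name), None)
    if director is None or client is None:
        return ["Director or Client resource not found in the Director's file"]
    # A File daemon may permit several Directors; pick the one with our name.
    fd_dir = next((r for t, r in fd_res
                   if t == "director" and r.get("name") == director.get("name")),
                  None)
    if fd_dir is None:
        return ["no Director resource named %r in the File daemon's file"
                % director.get("name")]
    if fd_dir.get("password") != client.get("password"):
        return ["Client password differs from the File daemon's Director password"]
    return []
```

To use it, read the contents of bacula-dir.conf and bacula-fd.conf into strings and call check_fd_auth(dir_text, fd_text, "name-of-your-client"). Any returned message points at the name or password pair that needs to be corrected.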

One user had serious problems with the configuration file until he realized that the Unix end of line conventions were used and Bacula wanted them in Windows format. This has not been confirmed though, and Bacula version 2.0.0 and above should now accept all end of line conventions (Windows, Unix, Mac).

Running Unix-like programs on Windows machines is a bit frustrating because the Windows command line shell (DOS Window) is rather primitive. As a consequence, it is not generally possible to see the debug information and certain error messages that Bacula prints. With a bit of work, however, it is possible. When everything else fails and you want to see what is going on, try the following:

Start a DOS shell Window.
c:Files-fd -t >out
type out

The precise path to bacula-fd depends on where it is installed. The -t option will cause Bacula to read the configuration file, print any error messages, and then exit. The > redirects the output to the file named out, which you can list with the type command.

If something is going wrong later, or you want to run Bacula with a debug option, you might try starting it as:

c:Files-fd -d 100 >out

In this case, Bacula will run until you explicitly stop it, which will give you a chance to connect to it from your Unix/Linux server. When you start the File daemon in debug mode it can write the output to a trace file bacula.trace in the current directory. To enable this, before running a job, use the console, and enter:

trace on

then run the job, and once you have terminated the File daemon, you will find the debug output in the bacula.trace file, which will probably be located in the same directory as bacula-fd.exe.

In addition, you should look in the System Applications log on the Control Panel to find any Windows errors that Bacula got during the startup process.

Finally, due to the above problems, when you turn on debugging, and specify trace=1 on a setdebug command in the Console, Bacula will write the debug information to the file bacula.trace in the directory from which Bacula is executing.

My ClientRunBeforeJob Scripts are Randomly Dying (Windows). What Should I Do?

If you are having problems with ClientRunBeforeJob scripts randomly dying, it is possible that you have run into an Oracle bug. See bug number 622 in the bugs.bacula.org database. The following information has been provided by a user on this issue:

The information in this document applies to:
Oracle HTTP Server - Version: 9.0.4
Microsoft Windows Server 2003
Symptoms
When starting an OC4J instance, the System Clock runs faster, about 7
seconds per minute.
Cause
+ This is caused by the Sun JVM bug 4500388, which states that "Calling
Thread.sleep() with a small argument affects the system clock". Although
this is reported as fixed in JDK 1.4.0_02, several reports contradict this
(see the bug in
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4500388).
+ Also reported by Microsoft as "The system clock may run fast when you
use the ACPI power management timer as a high-resolution counter on Windows
2000-based computers" (See http://support.microsoft.com/?id=821893)

You may wish to start the daemon with debug mode on rather than doing it using bconsole. To do so, edit the following registry key:

HKEY_LOCAL_MACHINE-dir

using regedit, then add -dnn after the /service option, where nn represents the debug level you want.

My Linux Client Immediately Dies When I Start It

The most common problem is either that the configuration file is not where the File daemon expects it to be, or that there is an error in the configuration file. The configuration file should be located in /opt/bacula/etc/ for Bacula Enterprise and, by default, in /etc/bacula/ for Bacula Community.

To see what is going on when the File daemon starts on Linux, do the following:

/opt/bacula/bin/bacula-fd -d100 -c /opt/bacula/etc/bacula-fd.conf

or

/etc/bacula/bin/bacula-fd -d100 -c /etc/bacula/bacula-fd.conf

This will cause the FD to write the debug output to the command line, which you can examine and thereby determine the problem.
