Alerts
In addition to the reporting features described in the Reports section, BGuardian implements an alert framework that keeps track of issues across successive executions of the tool. The purpose of this framework is to ease the management of the different issues and to show the status of an environment at a given point in time when BGuardian is run regularly.
Alerts group the information of the issues into larger entities. For example, the same job can produce many records in the deviation service, one for each affected execution. The issue report shows each entry individually, while the alert framework keeps a single alert for that job name, grouping all the affected executions.
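The grouping idea can be pictured with a short Python sketch. It assumes issue codes of the form `<service>__<entity>__<jobid>`, as in the deviation example later in this section, and that the alert code is made of the first two parts. `group_issues` is a hypothetical helper for illustration only, not a BGuardian function.

```python
from collections import defaultdict


def group_issues(issue_codes):
    """Group issue codes such as 'GDV__GuardianJob__9' by alert code.

    Assumption: the alert code is the first two '__'-separated parts
    (service prefix and entity); anything after that identifies a
    single execution.
    """
    alerts = defaultdict(list)
    for code in issue_codes:
        parts = code.split("__")
        alert_code = "__".join(parts[:2])
        alerts[alert_code].append(code)
    return dict(alerts)


if __name__ == "__main__":
    issues = ["GDV__GuardianJob__9", "GDV__GuardianJob__12", "GC__CONF"]
    for alert_code, members in group_issues(issues).items():
        print(alert_code, "<-", members)
```

A job name that itself contains `__` would break this simple split, so the sketch only illustrates the one-alert-per-entity idea.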
BGuardian allows precise selection of the services to run. However, the backup administrator will find situations where a service is still worth running, but some of its records have already been acknowledged and should no longer appear in the reports or in the alert lists. This is where the ignore feature comes in. Using the code of a given issue (or, in some situations, part of it), it is possible to mark an issue or a group of issues to be ignored in further executions.
This section describes the alert structure and the available commands, with some examples of their use.
Alerts structure
Alerts are stored in the path defined by the ‘alerts_path’ parameter. Each alert is represented by a .json file whose name is made up of the alert code and the alert kind.
$ ls -l
total 56
-rw-rw-r-- 1 bac bac 343 jun 16 12:31 GC__CATA.on.json
-rw-rw-r-- 1 bac bac 342 jun 16 12:31 GC__CONF.on.json
-rw-rw-r-- 1 bac bac 396 jun 16 12:31 GC__CONS.on.json
-rw-rw-r-- 1 bac bac 324 jun 16 12:31 GC__COPY.on.json
-rw-rw-r-- 1 bac bac 394 jun 16 12:31 GC__EVEN.on.json
-rw-rw-r-- 1 bac bac 401 jun 16 12:31 GC__MALW.on.json
-rw-rw-r-- 1 bac bac 482 jun 16 12:31 GC__PASS.on.json
-rw-rw-r-- 1 bac bac 371 jun 16 12:31 GC__PERM.on.json
-rw-rw-r-- 1 bac bac 312 jun 16 12:31 GC__REST.on.json
-rw-rw-r-- 1 bac bac 406 jun 16 12:31 GC__SECU.on.json
-rw-rw-r-- 1 bac bac 310 jun 16 12:31 GC__VERI.on.json
-rw-rw-r-- 1 bac bac 1298 jun 16 12:31 GDV__guardianjob.on.json
-rw-rw-r-- 1 bac bac 219 jun 16 12:31 GRF__guardianjob.on.json
-rw-rw-r-- 1 bac bac 239 jun 16 12:31 GSR__guardianjob.on.json
The ‘on.json’ extension represents an active alert, while the ‘off.json’ extension represents an alert that will be ignored in future executions.
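The naming convention can be decoded with a few lines of Python. This is a minimal sketch, assuming every alert file follows the `<code>.<on|off>.json` pattern shown in the listing above. `parse_alert_filename` and `split_alerts` are hypothetical helpers, not part of BGuardian.

```python
from pathlib import Path


def parse_alert_filename(name):
    """Split 'GC__CATA.on.json' into ('GC__CATA', 'on').

    Returns None when the name does not follow the assumed
    '<code>.<on|off>.json' pattern.
    """
    parts = name.rsplit(".", 2)
    if len(parts) != 3 or parts[2] != "json" or parts[1] not in ("on", "off"):
        return None
    return parts[0], parts[1]


def split_alerts(alerts_path):
    """Return (active, ignored) lists of codes found in alerts_path."""
    active, ignored = [], []
    for entry in sorted(Path(alerts_path).iterdir()):
        parsed = parse_alert_filename(entry.name)
        if parsed is None:
            continue  # not an alert file, skip it
        code, state = parsed
        (active if state == "on" else ignored).append(code)
    return active, ignored
```

Calling `split_alerts` on the directory configured in ‘alerts_path’ would then return the active codes and the ignored codes as two separate lists.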
List alerts
The following command lists active alerts:
# Text mode
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian list_text
# JSON mode
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian list
Example output in text format:
GC__CONF | LOW | Service: configurationsecurity | Entity: config_backup | Details: {"CONFIG_BACKUP":{"passed":false,"description":"Catalog Backup Job executions","details":"Catalog Backup Job was not run last 10 days","subCheck":"CONFIG_BACKUP","severity":"HIGH"}}
GC__CONS | LOW | Service: configurationsecurity | Entity: consoles | Details: {"CONSOLES":{"passed":false,"description":"Restricted consoles","details":"No restricted console was found. If external connections are allowed, it is recommended to use restricted consoles for them","subCheck":"CONSOLES","severity":"LOW"}}
The structure of an alert is similar to the structure of an issue. It is composed of:
Code | Severity | BGuardian service | Affected entity | Issue details
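For scripting purposes, a text-mode line can be split back into these five fields. The following Python sketch assumes the ` | ` separator and the `Service: `, `Entity: ` and `Details: ` labels exactly as they appear in the example output above. `parse_alert_line` is a hypothetical helper, and the JSON mode of the command is likely the more robust input for real automation.

```python
import json


def _strip_label(field, label):
    """Remove a leading label such as 'Service: ' if it is present."""
    return field[len(label):] if field.startswith(label) else field


def parse_alert_line(line):
    """Parse one text-mode alert line into a dictionary."""
    # Split at most four times so any ' | ' inside the details survives.
    code, severity, service, entity, details = line.strip().split(" | ", 4)
    raw = _strip_label(details, "Details: ")
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = raw  # keep non-JSON details as plain text
    return {
        "code": code,
        "severity": severity,
        "service": _strip_label(service, "Service: "),
        "entity": _strip_label(entity, "Entity: "),
        "details": parsed,
    }
```

Lines with fewer than five fields raise a `ValueError` in this sketch, which makes unexpected output easy to spot.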
The following command lists ignored alerts:
# Text mode
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian list_ignore_text
# JSON mode
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian list_ignore
Ignore alerts
The following command adds ‘ignores’, which means that issues matching the ignore code will be excluded from the active alerts and from further BGuardian executions.
# One single code
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian ignore code1
# Several codes at once
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian ignore code1, code2, code3
Using the codes shown in the List alerts example, both alerts can be ignored with the following command:
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian ignore GC__CONF, GC__CONS
Some services work with entities and also with additional information. For example, the deviation service works with job names and with job ids:
############### Service: Deviation ###############
HIGH | GDV__GuardianJob__9 | Job: 9 GuardianJob F 2023-06-16 00:00:00
| Size (read): 142,39 MiB +4080,1% | Size (write): 142,43 MiB
| Files: 4,02 K | Duration: 3s
| Executions: 5 | Average: 3,41 MiB r | 3,41 MiB w - 819 files - 0s
| Details: Significant deviations found: Job size (bytes read) increased 4080,1% from the expected estimated value of: 3,41 MiB.
Here it is possible to ignore every execution of GuardianJob. To do so:
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian ignore GDV__GuardianJob
However, it is also possible to ignore only a particular job id:
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian ignore GDV__GuardianJob__9
In this case, new deviations of the same job will still be reported in subsequent executions.
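The difference between ignoring a whole job and ignoring a single job id can be modelled as a prefix match on the `__`-separated parts of the code. The Python sketch below is only an illustration of that idea. The exact matching rules used by BGuardian (for example, case sensitivity) are not described here, so the segment-wise, case-sensitive comparison is an assumption, and `matches_ignore` and `is_ignored` are hypothetical names.

```python
def matches_ignore(issue_code, ignore_code):
    """Return True if ignore_code covers issue_code.

    Codes are compared part by part, so 'GDV__GuardianJob__9'
    does not accidentally cover 'GDV__GuardianJob__90'.
    """
    issue_parts = issue_code.split("__")
    ignore_parts = ignore_code.split("__")
    return issue_parts[: len(ignore_parts)] == ignore_parts


def is_ignored(issue_code, ignore_codes):
    """Return True if any code in ignore_codes covers issue_code."""
    return any(matches_ignore(issue_code, code) for code in ignore_codes)


if __name__ == "__main__":
    # Ignoring the whole job covers every job id.
    print(is_ignored("GDV__GuardianJob__9", ["GDV__GuardianJob"]))
    # Ignoring one job id leaves other executions visible.
    print(is_ignored("GDV__GuardianJob__10", ["GDV__GuardianJob__9"]))
```

Under these assumptions the first call reports the issue as ignored and the second does not, which mirrors the two commands above.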
Manually remove alerts
It is possible to remove active alerts using a very similar format:
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian remove_alert GDV__GuardianJob__9
To remove something from the ignore list:
sudo -u bacula /bin/bash /opt/bacula/bin/bguardian remove_ignore GC__CATA
Alerts recovery
When the situation flagged by a given issue is resolved, BGuardian automatically removes the associated alert.
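Conceptually, recovery is a comparison between the alerts that were active after the previous run and the issues found in the current run. The Python sketch below illustrates that comparison with plain sets. It is not BGuardian's actual implementation, and `diff_alerts` is a hypothetical helper.

```python
def diff_alerts(previous_active, current_issues):
    """Compare alert codes between two consecutive runs.

    'created' holds codes seen for the first time, 'recovered' holds
    codes that were active before but are no longer detected, and
    'unchanged' holds codes present in both runs.
    """
    previous_active = set(previous_active)
    current_issues = set(current_issues)
    return {
        "created": sorted(current_issues - previous_active),
        "recovered": sorted(previous_active - current_issues),
        "unchanged": sorted(previous_active & current_issues),
    }


if __name__ == "__main__":
    before = {"GC__PERM", "GC__CONF"}
    now = {"GC__CONF", "GC__CONS"}
    print(diff_alerts(before, now))
```

In this example `GC__PERM` falls into the recovered group because it was active before and is absent from the current run.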
Events
To notify the backup administrator when an alert is created or recovered, BGuardian uses the Events feature. This means it sends a Message of type Event with the information about what happened.
Example of a BGuardian event reporting a permissions recovery (source ‘bguardian’):
*list events
Automatically selected Catalog: MyCatalog
Using Catalog "MyCatalog"
+---------------------+---------------+---------------+-----------------------+------------------------------------------+
| time                | daemon        | source        | type                  | events                                   |
+---------------------+---------------+---------------+-----------------------+------------------------------------------+
| 2023-06-16 13:20:57 | 127.0.0.1-dir | *Director*    | daemon                | Director configuration reloaded          |
| 2023-06-16 13:21:34 | 127.0.0.1-dir | *Console*     | connection            | Connection from 127.0.0.1:8101           |
| 2023-06-16 13:21:34 | 127.0.0.1-dir | **bguardian** | configurationsecurity | BGuardian alert [GC__PERM] was recovered |
| 2023-06-16 13:21:34 | 127.0.0.1-dir | *Console*     | connection            | Disconnection from 127.0.0.1:8101        |
| 2023-06-16 13:21:37 | 127.0.0.1-dir | *Console*     | connection            | Connection from 127.0.0.1:8101           |
+---------------------+---------------+---------------+-----------------------+------------------------------------------+
Note that for the Events feature to work, events must be enabled in the Messages resources, as discussed in the Configuration section.