Google Mail
The Bacula Enterprise Google Workspace Plugin can protect the Google mailboxes associated with users. Advanced selection methods let you decide exactly what is backed up (labels included or excluded, users included or excluded) and control precisely which messages to restore and where (the original user’s account or another user’s account). The information protected with this service is:
Labels
System labels: Inbox, Sent, Draft, Spam …
User labels
Mails
Metadata
Contents
Attachments
Settings
AutoForwarding settings
Imap settings
Language settings
Pop settings
Delegates
Filters
SendAs addresses
Forwarding addresses
Mailbox backup includes the following features:
Incremental/Differential backup with Delta function:
The Delta function is always applied, regardless of the label selection in the fileset (email_files* parameters)
MIME object (RFC 822) export:
A selection of emails can be restored to a local path in RFC 822 format, so that they can be read or imported with any other tool
Attachments:
This plugin backs up the message with its attachments in a single file. This means everything will be restored when selecting a given file.
For export purposes, a restore option extracts the attachments as separate files in addition to the full MIME object.
Messages are named in the catalog so that no sensitive information is exposed, and they are stored under a path like this:
/@gw/customer_id/users/user@customerdomain.com/email/labelname/AAAAXXXXDDDDDIIIII.msg
Where the message name corresponds to the message ID provided by the Gmail API
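As an illustration of this layout, the following JavaScript sketch builds and parses such a path. Both function names are hypothetical helpers and are not part of the plugin; treating a label name as possibly containing '/' (nested labels) is an assumption made for the example.

```javascript
// Hypothetical helper: builds the catalog path of a backed-up message,
// following the layout shown above. Not part of the plugin.
function messageCatalogPath(customerId, userEmail, labelName, messageId) {
  return `/@gw/${customerId}/users/${userEmail}/email/${labelName}/${messageId}.msg`;
}

// Hypothetical helper: splits such a path into its components.
// Returns null when the path does not follow the message layout.
function parseMessageCatalogPath(path) {
  const match = /^\/@gw\/([^/]+)\/users\/([^/]+)\/email\/(.+)\/([^/]+)\.msg$/.exec(path);
  if (match === null) {
    return null;
  }
  const [, customerId, userEmail, labelName, messageId] = match;
  return { customerId, userEmail, labelName, messageId };
}

const path = messageCatalogPath(
  "customer_id",
  "user@customerdomain.com",
  "labelname",
  "AAAAXXXXDDDDDIIIII"
);
console.log(path);
console.log(parseMessageCatalogPath(path));
```

Only the message ID appears in the file name, which is consistent with the goal of keeping subjects and senders out of the catalog path.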
Other objects will look as follows:
/@gw/customer_id/users/user@customerdomain.com/email/settings/
addr1@customerdomain.com.mailbox.sasad -> Send as address
addr2@customerdomain.com.mailbox.fwad -> Forwarding address
addr2@customerdomain.com.mailbox.del -> Delegate address
3838383aaadffdfdf.mailbox.fil -> Filter
settings.mailbox.autfw -> Autoforwarding settings
settings.mailbox.imap -> Imap settings
settings.mailbox.lan -> Language settings
settings.mailbox.pop -> Pop settings
settings.mailbox.vac -> Vacation settings
Backup parameters
The list below shows the specific backup parameters that can be set up in order to control the behavior of the email module.
To select the email module, the common service parameter must be equal to, or contain, the value email.
Entities that can include mailboxes are: users.
Option | Required | Default | Values | Example | Description |
---|---|---|---|---|---|
email_files | No | | Strings representing existing labels for the given users separated by ‘,’ | Inbox, Sent | Backup only specified labels belonging to the selected users |
email_files_exclude | No | | Strings representing existing labels for the given users separated by ‘,’ | Archive, Personal | Exclude selected labels belonging to the selected users |
email_files_regex_include | No | | Valid regex | .*Company | Backup matching labels. Please provide only list parameters (files + files_exclude) or only regex ones; do not try to combine them. |
email_files_regex_exclude | No | | Valid regex | .*Plan | Exclude matching labels from the selection. Please provide only list parameters (files + files_exclude) or only regex ones; do not try to combine them. If this is the only parameter found for selection, all elements will be included and this list will be excluded. |
email_settings | No | No | 0, no, No, false, FALSE, off ; 1, yes, Yes, true, TRUE, on | Yes | Backup mailbox settings of included users |
email_spam_trash | No | Yes | 0, no, No, false, FALSE, off ; 1, yes, Yes, true, TRUE, on | No | Backup spam and trashed messages |
email_messages_exclude_expr | No | | String representing a valid Boolean Javascript expression regarding email message fields | emailSubject.includes('private') && !emailIsRead | Exclude from backup all messages that match the provided expression |
email_messages_exclude_index_expr | No | | String representing a valid Boolean Javascript expression regarding email message fields | /.*private.com/.test(emailFrom) | Exclude only from indexing (catalog email tables) messages matching the provided expression |
email_fields_exclude_index | No | | String representing a list of email message fields | emailFrom, emailSubject | Do not store into the index (catalog email tables) the provided list of message fields |
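The label selection rules described in the table can be pictured with the JavaScript sketch below. It is only an approximation: the function name selectLabels is hypothetical, and the exact matching rules of the plugin (case sensitivity, whether a regex must match the whole label) are not documented here, so the choices made in the code are assumptions.

```javascript
// Sketch only: approximates the documented selection rules.
// Assumptions: list matching is exact and case-sensitive, regex matching
// uses RegExp.test (a partial match is enough).
function selectLabels(labels, params) {
  const splitList = (value) =>
    value ? value.split(",").map((item) => item.trim()).filter(Boolean) : [];

  const include = splitList(params.email_files);
  const exclude = splitList(params.email_files_exclude);
  const hasLists = include.length > 0 || exclude.length > 0;
  const hasRegex = Boolean(
    params.email_files_regex_include || params.email_files_regex_exclude
  );

  // The documentation asks not to combine list and regex parameters.
  if (hasLists && hasRegex) {
    throw new Error("Use either list parameters or regex parameters, not both");
  }

  if (hasRegex) {
    const inc = params.email_files_regex_include
      ? new RegExp(params.email_files_regex_include)
      : null;
    const exc = params.email_files_regex_exclude
      ? new RegExp(params.email_files_regex_exclude)
      : null;
    // With only an exclude regex, everything is included first.
    return labels.filter(
      (label) => (inc === null || inc.test(label)) && (exc === null || !exc.test(label))
    );
  }

  return labels.filter(
    (label) => (include.length === 0 || include.includes(label)) && !exclude.includes(label)
  );
}

const labels = ["Inbox", "Sent", "Archive", "Personal", "ACME Company", "Q3 Plan"];
console.log(selectLabels(labels, { email_files: "Inbox, Sent" }));
console.log(selectLabels(labels, { email_files_exclude: "Archive, Personal" }));
console.log(selectLabels(labels, { email_files_regex_include: ".*Company" }));
console.log(selectLabels(labels, { email_files_regex_exclude: ".*Plan" }));
```

The label names in the sample array are taken from the examples in the table, except "ACME Company" and "Q3 Plan", which are invented so that the regex examples have something to match.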
Restore
The list below shows the subset of restore parameters that can be used to control the behavior of email module restore operations:
destination_user, destination_path, send_report, allow_duplicates, debug, foreign_container_generation
email_export, email_export_attachments_extract
Use cases
The following restore scenarios are supported:
Restore labels and emails (with their attachments) to the original user’s mailbox or to a different user’s mailbox:
Restore parameters implied: destination_user
Restore emails to a specific label of a user:
Restore parameters implied: destination_path
Export emails to the local file system in export mode:
Restore parameters implied: destination_path, email_export
Extract attachments during the export: email_export_attachments_extract
It is possible to control whether or not duplicate elements are allowed (based on file id):
Restore parameters implied: allow_duplicates
Particularities:
If no destination_user is set, every message will be restored into its original mailbox
If no destination_path is set, every message will be restored into its original path
If the selection contains messages from several users:
Original user messages will be restored in their original location
For other users, a special label will be created with the email address of each of them, containing the full path and messages of the restored objects, unless the parameter foreign_container_generation is disabled
Restoring emails from two different users into a third mailbox without destination_path results in an auto-generated Restore_date label that contains one label per foreign user, each with the restored labels inside it
Restored elements will be duplicated by default, unless the allow_duplicates parameter is disabled
Even when that parameter is disabled, messages are compared by ID only. An element with the same content but a different ID is therefore not considered a duplicate
For more details about the behavior of each parameter, please check the general section of restore parameters.
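The duplicate check described in the particularities above can be pictured with a short JavaScript sketch. The function name and the shape of the message objects are hypothetical; the only point taken from the text is that the comparison is made on the ID and not on the content.

```javascript
// Sketch only: "id" stands for the message ID reported by the Gmail API.
// The function and the object shape are illustrative, not plugin code.
function messagesToRestore(selected, existingIds, allowDuplicates) {
  if (allowDuplicates) {
    // Default behavior: everything selected is restored again.
    return selected.slice();
  }
  const known = new Set(existingIds);
  // Only the ID is compared, never the subject or the body.
  return selected.filter((message) => !known.has(message.id));
}

const selected = [
  { id: "a1", subject: "Quarterly report" },
  { id: "b2", subject: "Quarterly report" }, // same content, different ID
];
const alreadyInMailbox = ["a1"];

// With duplicates allowed, both messages are restored.
console.log(messagesToRestore(selected, alreadyInMailbox, true).length);
// With duplicates disabled, only b2 is restored, although its content equals a1.
console.log(messagesToRestore(selected, alreadyInMailbox, false).map((m) => m.id));
```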
Messages exclude expressions
Bacula Systems is aware of the privacy concerns that arise when a tool like this Google Workspace Plugin makes it possible to back up and restore data belonging to different users, since the backup administrator can potentially restore private data at will. Moreover, emails are usually among the most critical items in terms of privacy.
One of the strategies this plugin offers to deal with that problem is the possibility to exclude messages. This is a very powerful feature: flexible expressions select a subset of messages, which is then simply excluded from the backup:
email_messages_exclude_expr new fileset parameter
Or only from the index (from the catalog)
email_messages_exclude_index_expr new fileset parameter
Besides excluding whole messages, it is also possible to keep only a subset of email fields in the protected information, by excluding fields from the backup index (the catalog):
email_fields_exclude_index new fileset parameter
All four discussed expressions are based on an internal structure of fields to work with. Below you can see the entire list of fields that you can use:
emailTags
emailSubject
emailFolderName
emailFrom
emailTo
emailCc
emailBodyPreview
emailImportance
emailTime
emailIsRead
emailIsDraft
Please note that it is very important to write the fields exactly as written above.
These fields can be used in a comma separated list in the ‘email_fields_exclude’ parameter and also ‘email_fields_exclude_index’ parameter.
For ‘email_messages_exclude_expr’ and ‘email_messages_exclude_index_expr’, use the fields in a valid boolean expression written in JavaScript syntax. Some examples are provided below:
emailSubject.includes('private')
!emailIsRead && (emailIsDraft || emailFolderName == 'Private')
emailTime < Date.parse('2012-11-01')
/.*private.com/.test(emailFrom)
Note
This feature is available since Bacula Enterprise version 14.0
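To get a feeling for how such expressions behave, the JavaScript sketch below evaluates some of the examples against a sample message. It is not the plugin's evaluator: the helper names are hypothetical, the sample values are invented, and the field types (booleans for emailIsRead and emailIsDraft, a millisecond timestamp for emailTime) are assumptions.

```javascript
// Field names exactly as listed above.
const FIELDS = [
  "emailTags", "emailSubject", "emailFolderName", "emailFrom", "emailTo",
  "emailCc", "emailBodyPreview", "emailImportance", "emailTime",
  "emailIsRead", "emailIsDraft",
];

// Hypothetical helper: turns an expression string into a predicate.
// Each field becomes a function parameter, so the expression can use it by name.
function compileExpression(expression) {
  return new Function(...FIELDS, `"use strict"; return Boolean(${expression});`);
}

// Hypothetical helper: true when the message matches the expression.
function matches(expression, message) {
  const predicate = compileExpression(expression);
  return predicate(...FIELDS.map((field) => message[field]));
}

// Invented sample message; the types are assumptions.
const message = {
  emailTags: "orange,black",
  emailSubject: "This is private subject 12",
  emailFolderName: "Private",
  emailFrom: "elon@other.com",
  emailTo: "laura@other.com",
  emailCc: "",
  emailBodyPreview: "",
  emailImportance: "",
  emailTime: Date.parse("2021-12-05T11:30:00Z"),
  emailIsRead: false,
  emailIsDraft: false,
};

console.log(matches("emailSubject.includes('private')", message));
console.log(matches("!emailIsRead && (emailIsDraft || emailFolderName == 'Private')", message));
console.log(matches("/.*private.com/.test(emailFrom)", message));
```

A misspelled field name makes the expression throw a ReferenceError in this sketch, which illustrates why the field names must be written exactly as listed.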
Expression tester
It is not always obvious whether a prepared expression behaves as intended. To help with that, the Google Workspace Plugin provides a query method that tests expressions against a static, pre-loaded set of data.
There are two commands available:
Show command: It will show the static data in json format, so it is possible to see the contents to adapt the expressions to test
Test command: It will apply the expression parameters to the pre-loaded static data
The show command has the following format:
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com" parameter=email-expr-show
The test command has the following format:
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com email_messages_exclude_expr = \"<your-js-expression>\"" parameter=json|email-expr-test
// Or
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com email_messages_exclude_index_expr = \"<your-js-expression>\"" parameter=json|email-expr-test
The show command produces JSON output with objects in the same format that the plugin uses to store data into the catalog. Please note the ‘total’ value at the end, which shows that there are 12 pre-loaded messages in total:
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxx admin_user_email=admin@company.com" parameter=json|email-expr-show
....
"email-12": {
"body": {
"content": "These are the contents in text format of the 12 email of test data. It has the following categories:orange, black, white, purpleYou can try to filter this body using any JS method like /.*12.*/.test(emailBody) or emailBody.includes(12)",
"contentType": "TEXT"
},
"ccRecipients": [
{
"emailAddress": {
"address": "danny@other.com"
}
},
{
"emailAddress": {
"address": "lucas@other.com"
}
},
{
"emailAddress": {
"address": "terese@other.com"
}
}
],
"from": {
"emailAddress": {
"address": "elon@other.com"
}
},
"hasAttachments": false,
"isDraft": false,
"isRead": false,
"replyTo": [
{
"emailAddress": {
"address": "elon@other.com"
}
}
],
"sentDateTime": {
"dateTime": {
"date": {
"year": 2021,
"month": 12,
"day": 5
},
"time": {
"hour": 11,
"minute": 30,
"second": 0,
"nano": 0
}
},
"offset": {
"totalSeconds": 0
}
},
"subject": "This is private subject 12",
"toRecipients": [
{
"emailAddress": {
"address": "laura@other.com"
}
},
{
"emailAddress": {
"address": "jack@other.com"
}
},
{
"emailAddress": {
"address": "john@other.com"
}
}
],
"categories": [
"orange",
"black",
"white",
"purple"
]
}
},
{
"total": "12"
}
The test command, in turn, produces two different outputs. The first part has the same format as the show output and lists the messages that would be included in the backup. The second part has a different format, for example:
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com" parameter=json|email-expr-test
....
{
"meta-email-12": {
"EmailId": "",
"EmailOwner": "test@test.com",
"EmailTenant": "johndoe.onmicrosoft.com",
"EmailTags": "orange,black,white,purple",
"EmailSubject": "This is private subject 12",
"EmailFolderName": "/",
"EmailFrom": "elon@other.com",
"EmailTo": "laura@other.com,jack@other.com,john@other.com",
"EmailCc": "danny@other.com,lucas@other.com,terese@other.com",
"EmailInternetMessageId": "",
"EmailBodyPreview": "",
"EmailImportance": "",
"EmailConversationId": "",
"EmailSize": 235,
"EmailIsRead": 0,
"EmailIsDraft": 0,
"EmailHasAttachment": 0,
"Type": "EMAIL",
"Version": 1,
"Plugin": "gw"
}
},
{
"total-backup": "12"
},
{
"total-index": "12"
}
That second part represents the information that would be indexed in the backup (included in the catalog). The total entries are shown at the end, which is useful to quickly compare with the original value of 12 and check whether the expression filters the expected data. The example below applies some filtering to the backup and also to the index:
.query client=127.0.0.1-fd plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxyyyy admin_user_email=jorge@company.com email_messages_exclude_expr=\"emailFrom == 'elon@other.com'\" email_messages_exclude_index_expr=\"emailSubject.includes('private')\"" parameter=email-expr-test
...
meta-email-4={
"EmailId": "";
"EmailOwner": "jorge@company.com";
"EmailTenant": "xxxxyyyy";
"EmailTime": "2021-08-05 12:30:00";
"EmailTags": "SENT;UNREAD;SENT;orange;black;white;purple";
"EmailSubject": "This is orange subject 8";
"EmailFolderName": "sent";
"EmailFrom": "bob@company.com";
"EmailTo": "john@company.com";
"EmailCc": "terese@company.com";
"EmailInternetMessageId": "1533123860.7.1655130748637@jorge-Bravo-15-Bac";
"EmailBodyPreview": "These are the contents in text format of the 8 email of test data. It has the following categories:orange; black; white; purpleYou can try to filter this body using any JS method like /.*8.*/.test(emailBody) or emailBody.includes(8)";
"EmailImportance": "";
"EmailConversationId": "";
"EmailIsRead": 1;
"EmailIsDraft": 0;
"EmailHasAttachment": 0;
"Type": "EMAIL";
"Version": 1;
"Plugin": "gw"
}
total-backup=6
total-index=4
If your expression is not valid, the plugin reports it with the following message:
error=Error listing elements. Cause: Predicate test error!! Review your query ….
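The relationship between the total, total-backup and total-index counters can be reproduced offline with a small JavaScript sketch. It is an illustration under assumptions, not the plugin's implementation: the helper names and the four sample messages are invented, only three fields are modelled for brevity, and the index filter is assumed to be applied only to messages that survive the backup filter.

```javascript
// Subset of the documented fields, kept short for the example.
const FIELDS = ["emailSubject", "emailFrom", "emailIsRead"];

// Hypothetical helper: compiles an exclude expression into a predicate.
// An empty expression excludes nothing.
function compile(expression) {
  if (!expression) {
    return () => false;
  }
  return new Function(...FIELDS, `return Boolean(${expression});`);
}

// Hypothetical helper: counts messages the way the test output reports them.
function countTotals(messages, excludeExpr, excludeIndexExpr) {
  const excluded = compile(excludeExpr);
  const excludedFromIndex = compile(excludeIndexExpr);
  const args = (m) => FIELDS.map((field) => m[field]);

  const backedUp = messages.filter((m) => !excluded(...args(m)));
  // Assumption: only backed-up messages can be indexed.
  const indexed = backedUp.filter((m) => !excludedFromIndex(...args(m)));

  return {
    total: messages.length,
    totalBackup: backedUp.length,
    totalIndex: indexed.length,
  };
}

// Invented sample data, loosely modelled on the pre-loaded test messages.
const sample = [
  { emailSubject: "This is private subject 1", emailFrom: "elon@other.com", emailIsRead: false },
  { emailSubject: "This is orange subject 2", emailFrom: "bob@company.com", emailIsRead: true },
  { emailSubject: "This is private subject 3", emailFrom: "bob@company.com", emailIsRead: true },
  { emailSubject: "This is orange subject 4", emailFrom: "elon@other.com", emailIsRead: false },
];

const totals = countTotals(
  sample,
  "emailFrom == 'elon@other.com'",
  "emailSubject.includes('private')"
);
console.log(totals);
```

With this sample, two of the four messages are backed up and one of those two is indexed, which mirrors the way total-backup and total-index shrink in the plugin output when both expressions are set.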
Fileset examples
Backup Full MailBox of some users, but excluding some labels:
FileSet {
Name = fs-gw-drive-adjon-users-notemp
Include {
Options { signature = MD5 }
Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
user=\"adelev@baculasystems.com,jonis@baculasystems.com\" email_files_exclude=\"*.temporary\""
}
}
Backup all MailBoxes:
FileSet {
Name = fs-gw-email-all
Include {
Options { signature = MD5 }
Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com"
}
}
Backup only the Inbox label of some users:
FileSet {
Name = fs-gw-email-2user-inbox
Include {
Options { signature = MD5 }
Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
user=\"peter@baculasystems.com,john@baculasystems.com\" email_files=inbox"
}
}
Backup some users and include settings:
FileSet {
Name = fs-gw-email-2user-mime
Include {
Options { signature = MD5 }
Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
user=\"peter@baculasystems.com,miriam@baculasystems.com\" email_settings=yes"
}
}
System labels
Google Mail can present the labels information in local languages to the user.
In general, there is no ‘multilanguage’ support, in the sense that labels must be included with their original name. For example, if you create a label named ‘books’, you cannot expect it to be backed up if you use something like ‘livres’ or ‘libros’ from other languages. You need to use the real name that was used to create such label.
There is one very important special case though, which is ‘system labels’. System labels are labels like ‘inbox’, ‘sent’ … A full list can be found here: https://developers.google.com/gmail/api/guides/labels
These kinds of labels can be found by the plugin using their standard name instead of their internal ID, which is what the general case requires. Therefore, such a label can be selected with its well-known English name even if the user sees it under a translated name.
For example, to back up the inbox you must use ‘inbox’, even if some users see it as ‘Posteingang’ or ‘boîte de réception’. The Google Workspace Plugin recognizes these special words and queries the information through them.
To summarize:
System labels -> Use English word
Other user labels -> Use original name
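The summary above can be expressed as a small JavaScript sketch. The function name is hypothetical, the set contains only a few of the system labels from the Gmail API list linked above, and the case-insensitive comparison is an assumption based on this document using both ‘Inbox’ and ‘inbox’.

```javascript
// A few Gmail system labels; see the Gmail API documentation for the full list.
const SYSTEM_LABELS = new Set([
  "INBOX", "SENT", "DRAFT", "SPAM", "TRASH", "STARRED", "IMPORTANT", "UNREAD",
]);

// Hypothetical helper: classifies a label name taken from a fileset parameter.
// Assumption: system labels are matched case-insensitively.
function resolveLabel(configuredName) {
  const trimmed = configuredName.trim();
  const upper = trimmed.toUpperCase();
  if (SYSTEM_LABELS.has(upper)) {
    // System label: the English word is used, whatever the user's language.
    return { kind: "system", name: upper };
  }
  // User label: the name must be the one used when the label was created.
  return { kind: "user", name: trimmed };
}

console.log(resolveLabel("inbox"));
console.log(resolveLabel("Posteingang"));
console.log(resolveLabel("books"));
```

In this sketch ‘Posteingang’ is classified as a user label, so it would only select a label literally created with that name, not the inbox.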