Google Mail

The Bacula Enterprise Google Workspace Plugin can protect the Google mailboxes associated with users. Advanced selection methods determine exactly what is backed up (labels included/excluded, users included/excluded) and control precisely which messages are restored and where (the original user’s account or another user’s account). The information protected by this service is:

  • Labels

    • System labels: Inbox, Sent, Draft, Spam …

    • User labels

  • Mails

    • Metadata

    • Contents

  • Attachments

  • Settings

    • AutoForwarding settings

    • Imap settings

    • Language settings

    • Pop settings

  • Delegates

  • Filters

  • SendAs addresses

  • Forwarding addresses

Mailbox backup includes the following features:

  • Incremental/Differential backup with Delta function:

    • The Delta function is always applied, regardless of the label selection in the fileset (email_files* parameters)

  • MIME object (RFC 822) export:

    • A selection of emails can be restored to a local label in RFC 822 format, so that they can be read or imported with any other tool

  • Attachments:

    • This plugin backs up the message with its attachments in a single file. This means everything will be restored when selecting a given file.

    • For export purposes, a restore option extracts the attachments as separate files in addition to the full MIME object.

Messages are named in the catalog so that they do not include sensitive information, and they are stored under a path like this:

  • /@gw/customer_id/users/user@customerdomain.com/email/labelname/AAAAXXXXDDDDDIIIII.msg

    • Where the message name corresponds to the message ID provided by the Gmail API

Other objects will look as follows:

  • /@gw/customer_id/users/user@customerdomain.com/email/settings/

    • addr1@customerdomain.com.mailbox.sasad -> Send as address

    • addr2@customerdomain.com.mailbox.fwad -> Forwarding address

    • addr2@customerdomain.com.mailbox.del -> Delegates address

    • 3838383aaadffdfdf.mailbox.fil -> Filter

    • settings.mailbox.autfw -> Autoforwarding settings

    • settings.mailbox.imap -> Imap settings

    • settings.mailbox.lan -> Language settings

    • settings.mailbox.pop -> Pop settings

    • settings.mailbox.vac -> Vacation settings
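
The catalog layout above can be illustrated with a small parser. This is only a sketch based on the example paths shown in this section: `parseCatalogPath` is a hypothetical helper, not part of the plugin, and it assumes the path components always appear in the order shown.

```javascript
// Hypothetical helper (not part of the plugin): splits a catalog path that
// follows the layout shown above into its components.
function parseCatalogPath(path) {
  const parts = path.split("/").filter((p) => p.length > 0);
  // Assumed layout: @gw / customer_id / users / user / email / ...
  if (
    parts.length < 6 ||
    parts[0] !== "@gw" ||
    parts[2] !== "users" ||
    parts[4] !== "email"
  ) {
    return null;
  }
  const [, customerId, , user, , ...rest] = parts;
  const fileName = rest[rest.length - 1];

  // Messages: <label path>/<message id>.msg
  if (fileName.endsWith(".msg")) {
    return {
      kind: "message",
      customerId,
      user,
      label: rest.slice(0, -1).join("/"),
      messageId: fileName.slice(0, -".msg".length),
    };
  }

  // Other objects: settings/<name>.mailbox.<suffix>
  if (rest.length === 2 && rest[0] === "settings") {
    const m = /^(.*)\.mailbox\.([a-z]+)$/.exec(fileName);
    if (m) {
      return { kind: "settings", customerId, user, name: m[1], suffix: m[2] };
    }
  }
  return null;
}

console.log(
  parseCatalogPath(
    "/@gw/customer_id/users/user@customerdomain.com/email/labelname/AAAAXXXXDDDDDIIIII.msg"
  )
);
```

A helper like this could be useful when post-processing catalog listings, for example to group restored entries by user or label.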

Backup parameters

The list below shows the specific backup parameters that can be set up in order to control the behavior of the email module.

To select the email module, the common service parameter must be equal to, or contain, the value email.

Entities that can include mailboxes are: users.

| Option | Required | Default | Values | Example | Description |
|---|---|---|---|---|---|
| email_files | No | | Strings representing existing labels for the given users separated by ‘,’ | Inbox, Sent | Backup only specified labels belonging to the selected users |
| email_files_exclude | No | | Strings representing existing labels for the given users separated by ‘,’ | Archive, Personal | Exclude selected labels belonging to the selected users |
| email_files_regex_include | No | | Valid regex | .*Company | Backup matching labels. Please provide either list parameters (files + files_exclude) or regex ones, but do not try to combine them. |
| email_files_regex_exclude | No | | Valid regex | .*Plan | Exclude matching labels from the selection. Please provide either list parameters (files + files_exclude) or regex ones, but do not try to combine them. If this is the only parameter found for selection, all elements will be included and this list will be excluded. |
| email_settings | No | No | 0, no, No, false, FALSE, off ; 1, yes, Yes, TRUE, true, on | Yes | Backup mailbox settings of included users |
| email_spam_trash | No | Yes | 0, no, No, false, FALSE, off ; 1, yes, Yes, TRUE, true, on | No | Backup spam and trashed messages |
| email_messages_exclude_expr | No | | String representing a valid Boolean Javascript expression regarding email message fields | emailSubject.includes('private') && !emailIsRead | Exclude from backup all messages that match the provided expression |
| email_messages_exclude_index_expr | No | | String representing a valid Boolean Javascript expression regarding email message fields | /.*private.com/.test(emailFrom) | Exclude only from indexing (catalog email tables) messages matching the provided expression |
| email_fields_exclude_index | No | | String representing a list of email message fields | emailFrom, emailSubject | Do not store into the index (catalog email tables) the provided list of message fields |
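
The rule that list parameters and regex parameters should not be combined can be checked before a fileset is written. The following is a minimal sketch under stated assumptions: `checkLabelSelection` is a hypothetical pre-flight helper, not something the plugin provides, and it validates regexes with the JavaScript engine, whose syntax may differ from the regex flavor the plugin actually uses.

```javascript
// Hypothetical pre-flight check (not part of the plugin): reports problems
// in the label selection parameters of a planned fileset.
function checkLabelSelection(params) {
  const listKeys = ["email_files", "email_files_exclude"];
  const regexKeys = ["email_files_regex_include", "email_files_regex_exclude"];
  const isSet = (key) =>
    typeof params[key] === "string" && params[key].trim() !== "";

  const usedList = listKeys.filter(isSet);
  const usedRegex = regexKeys.filter(isSet);
  const errors = [];

  // The documentation asks for list parameters OR regex parameters, not both.
  if (usedList.length > 0 && usedRegex.length > 0) {
    errors.push(
      `list parameters (${usedList.join(", ")}) combined with ` +
        `regex parameters (${usedRegex.join(", ")})`
    );
  }

  // Assumption: JavaScript regex syntax is close enough for a sanity check.
  for (const key of usedRegex) {
    try {
      new RegExp(params[key]);
    } catch (err) {
      errors.push(`${key} is not a valid regex: ${err.message}`);
    }
  }
  return errors;
}

console.log(checkLabelSelection({ email_files: "Inbox, Sent" })); // []
console.log(
  checkLabelSelection({
    email_files: "Inbox",
    email_files_regex_exclude: ".*Plan",
  }).length
); // 1
```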

Restore

The list below shows the subset of restore parameters that can be used to control the behavior of email module restore operations:

  • destination_user, destination_path, send_report, allow_duplicates, debug, foreign_container_generation

  • email_export, email_export_attachments_extract

Use cases

The following restore scenarios are supported:

  • Restore labels and emails (with their attachments) to the original user or to a different user’s mailbox:

    • Restore parameters involved: destination_user

  • Restore emails to a specific label of a user:

    • Restore parameters involved: destination_path

  • Export emails to the local file system in export mode:

    • Restore parameters involved: destination_path, email_export

    • Extract attachments during the export: email_export_attachments_extract

  • Control whether or not duplicate elements are allowed (based on file ID):

    • Restore parameters involved: allow_duplicates

Particularities:

  • If no destination_user is set, every message will be restored into its original mailbox

  • If no destination_path is set, every message will be restored into its original path

    • If the selection contains messages from several users:

      • Original user messages will be restored in their original location

      • For other users, a special label will be created with the email address of each of them, containing the full path and messages of the restored objects, unless the parameter foreign_container_generation is disabled

    Restoring emails from two different users into a third mailbox without destination_path results in an auto-generated Restore_date label that contains one label per foreign user, each holding the restored labels

  • Restored elements will be duplicated by default, unless the allow_duplicates parameter is disabled

    • Even when that parameter is disabled, messages are compared by ID only, so an element with the same content but a different ID is not considered a duplicate
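
The ID-based duplicate check can be modeled in a few lines. This is a simplified sketch, not the plugin's implementation: `selectForRestore` and the message objects are hypothetical, and the only behavior taken from the text above is that duplicates are detected by ID and not by content.

```javascript
// Hypothetical model of the ID-based duplicate check described above.
// backedUp: messages selected for restore; existingIds: Set of IDs already
// present in the destination mailbox.
function selectForRestore(backedUp, existingIds, allowDuplicates) {
  if (allowDuplicates) {
    return backedUp.slice(); // default: everything is restored again
  }
  // Only the ID is compared; identical content with a new ID still passes.
  return backedUp.filter((msg) => !existingIds.has(msg.id));
}

const inMailbox = new Set(["a1"]);
const selection = [
  { id: "a1", subject: "Quarterly report" },
  { id: "b2", subject: "Quarterly report" }, // same content, different ID
];

console.log(selectForRestore(selection, inMailbox, false).map((m) => m.id));
// [ 'b2' ]
console.log(selectForRestore(selection, inMailbox, true).map((m) => m.id));
// [ 'a1', 'b2' ]
```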

For more details about the behavior of each parameter, please check the general section of restore parameters.

Messages exclude expressions

Bacula Systems is aware of the privacy concerns that arise when a tool like the Google Workspace Plugin makes it possible to back up and restore data belonging to different users, since the backup administrator can potentially restore private data at will. Moreover, emails are usually among the most critical items in terms of privacy.

One of the strategies this plugin offers to deal with that problem is message exclusion. This is a powerful feature: flexible expressions select a subset of messages, which are then excluded from the backup:

  • email_messages_exclude_expr new fileset parameter

Or only from the index (the catalog):

  • email_messages_exclude_index_expr new fileset parameter

Besides excluding whole messages, it is also possible to exclude individual email fields from the backup index (the catalog):

  • email_fields_exclude_index new fileset parameter

All of these parameters are based on an internal structure of fields. The entire list of fields that can be used is shown below:

  • emailTags

  • emailSubject

  • emailFolderName

  • emailFrom

  • emailTo

  • emailCc

  • emailBodyPreview

  • emailImportance

  • emailTime

  • emailIsRead

  • emailIsDraft

Please note that the field names must be written exactly as shown above.

These fields can be used in a comma-separated list in the ‘email_fields_exclude’ parameter and in the ‘email_fields_exclude_index’ parameter.

For ‘email_messages_exclude_expr’ and ‘email_messages_exclude_index_expr’, use them in a valid Boolean expression in JavaScript syntax. Some examples are provided below:

Expression to exclude messages where subject includes the word ‘private’
emailSubject.includes('private')
Complex expression to exclude messages that are unread and are either drafts or in the label named Private
!emailIsRead && (emailIsDraft || emailFolderName == 'Private')
Expression to exclude messages based on the received or sent date
emailTime < Date.parse('2012-11-01')
Expression to exclude messages using a regex based on emailFrom
/.*private.com/.test(emailFrom)
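
To get a feel for how such expressions evaluate, they can be tried locally against a sample message. The snippet below is only an approximation of the idea, not the plugin's evaluator: `matches` is a hypothetical helper, the sample values are invented, and it assumes that `emailTime` is a timestamp in milliseconds and that the read/draft flags are booleans.

```javascript
// Field names exactly as listed in this section.
const FIELDS = [
  "emailTags", "emailSubject", "emailFolderName", "emailFrom", "emailTo",
  "emailCc", "emailBodyPreview", "emailImportance", "emailTime",
  "emailIsRead", "emailIsDraft",
];

// Hypothetical local evaluator: exposes each field as a variable and
// evaluates the expression. new Function executes arbitrary code, so this
// is only suitable for expressions you wrote yourself.
function matches(expression, message) {
  const fn = new Function(...FIELDS, `return (${expression});`);
  return Boolean(fn(...FIELDS.map((name) => message[name])));
}

// Invented sample message; field types are assumptions.
const sample = {
  emailTags: "orange,black",
  emailSubject: "This is private subject 12",
  emailFolderName: "Private",
  emailFrom: "elon@other.com",
  emailTo: "laura@other.com",
  emailCc: "",
  emailBodyPreview: "",
  emailImportance: "",
  emailTime: Date.parse("2021-12-05T11:30:00Z"),
  emailIsRead: false,
  emailIsDraft: false,
};

console.log(matches("emailSubject.includes('private')", sample)); // true
console.log(
  matches("!emailIsRead && (emailIsDraft || emailFolderName == 'Private')", sample)
); // true
console.log(matches("/.*private.com/.test(emailFrom)", sample)); // false
console.log(matches("emailTime < Date.parse('2012-11-01')", sample)); // false
```

A message for which the expression evaluates to true is the one that would be excluded.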

Note

This feature is available since Bacula Enterprise version 14.0

Expression tester

It is not always obvious to end users whether an expression behaves as intended. To help with that, the Google Workspace Plugin provides a query method that tests expressions against a static, pre-loaded set of data.

There are two commands available:

  • Show command: shows the static data in JSON format, so that expressions can be adapted to its contents

  • Test command: applies the expression parameters to the pre-loaded static data

The show command has the following format:

Expression tester Show command
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com" parameter=email-expr-show

The test command has the following format:

Expression tester Test command
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com email_messages_exclude_expr = \"<your-js-expression>\"" parameter=json|email-expr-test
// Or
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com email_messages_exclude_index_expr = \"<your-js-expression>\"" parameter=json|email-expr-test

The show command produces JSON output with objects in the same format that the plugin uses to store data in the catalog. Note the ‘total’ value at the end, which shows that 12 messages are pre-loaded:

Expression tester Show command output
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxx admin_user_email=admin@company.com" parameter=json|email-expr-show
....
    "email-12": {
      "body": {
        "content": "These are the contents in text format of the 12 email of test data. It has the following categories:orange, black, white, purpleYou can try to filter this body using any JS method like /.*12.*/.test(emailBody) or emailBody.includes(12)",
        "contentType": "TEXT"
      },
      "ccRecipients": [
        {
          "emailAddress": {
            "address": "danny@other.com"
          }
        },
        {
          "emailAddress": {
            "address": "lucas@other.com"
          }
        },
        {
          "emailAddress": {
            "address": "terese@other.com"
          }
        }
      ],
      "from": {
        "emailAddress": {
          "address": "elon@other.com"
        }
      },
      "hasAttachments": false,
      "isDraft": false,
      "isRead": false,
      "replyTo": [
        {
          "emailAddress": {
            "address": "elon@other.com"
          }
        }
      ],
      "sentDateTime": {
        "dateTime": {
          "date": {
            "year": 2021,
            "month": 12,
            "day": 5
          },
          "time": {
            "hour": 11,
            "minute": 30,
            "second": 0,
            "nano": 0
          }
        },
        "offset": {
          "totalSeconds": 0
        }
      },
      "subject": "This is private subject 12",
      "toRecipients": [
        {
          "emailAddress": {
            "address": "laura@other.com"
          }
        },
        {
          "emailAddress": {
            "address": "jack@other.com"
          }
        },
        {
          "emailAddress": {
            "address": "john@other.com"
          }
        }
      ],
      "categories": [
        "orange",
        "black",
        "white",
        "purple"
      ]
    }
  },
  {
    "total": "12"
  }

The test command, in turn, produces two different outputs. The first part has the same format as the show output and lists the messages that would be included in the backup. The second part has a different format, with output like this:

Expression tester Test command, index part output
.query client=<your-fd-client> plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxxx admin_user_email=admin@company.com" parameter=json|email-expr-show
....
      {
    "meta-email-12": {
      "EmailId": "",
      "EmailOwner": "test@test.com",
      "EmailTenant": "johndoe.onmicrosoft.com",
      "EmailTags": "orange,black,white,purple",
      "EmailSubject": "This is private subject 12",
      "EmailFolderName": "/",
      "EmailFrom": "elon@other.com",
      "EmailTo": "laura@other.com,jack@other.com,john@other.com",
      "EmailCc": "danny@other.com,lucas@other.com,terese@other.com",
      "EmailInternetMessageId": "",
      "EmailBodyPreview": "",
      "EmailImportance": "",
      "EmailConversationId": "",
      "EmailSize": 235,
      "EmailIsRead": 0,
      "EmailIsDraft": 0,
      "EmailHasAttachment": 0,
      "Type": "EMAIL",
      "Version": 1,
      "Plugin": "gw"
    }
  },
  {
    "total-backup": "12"
  },
  {
    "total-index": "12"
  }

That part represents the information that would be indexed for the backup (included in the catalog). The totals are shown at the end, which makes it easy to compare them with the original value of 12 and to check whether the expression filters the expected data. The example below applies filtering to the backup and also to the index:

Expression tester Test command, index part output
.query client=127.0.0.1-fd plugin="gw: credentials_file=/opt/bacula/etc/gw-credentials.json customer_id=xxxxyyyy admin_user_email=jorge@company.com email_messages_exclude_expr=\"emailFrom == 'elon@other.com'\" email_messages_exclude_index_expr=\"emailSubject.includes('private')\"" parameter=email-expr-test
...
    meta-email-4={
     "EmailId": "";
     "EmailOwner": "jorge@company.com";
     "EmailTenant": "xxxxyyyy";
     "EmailTime": "2021-08-05 12:30:00";
     "EmailTags": "SENT;UNREAD;SENT;orange;black;white;purple";
     "EmailSubject": "This is orange subject 8";
     "EmailFolderName": "sent";
     "EmailFrom": "bob@company.com";
     "EmailTo": "john@company.com";
     "EmailCc": "terese@company.com";
     "EmailInternetMessageId": "1533123860.7.1655130748637@jorge-Bravo-15-Bac";
     "EmailBodyPreview": "These are the contents in text format of the 8 email of test data. It has the following categories:orange; black; white; purpleYou can try to filter this body using any JS method like /.*8.*/.test(emailBody) or emailBody.includes(8)";
     "EmailImportance": "";
     "EmailConversationId": "";
     "EmailIsRead": 1;
     "EmailIsDraft": 0;
     "EmailHasAttachment": 0;
     "Type": "EMAIL";
     "Version": 1;
     "Plugin": "gw"
   }
   total-backup=6
   total-index=4

If your expression is not valid, the plugin reports it with the following message:

  • error=Error listing elements. Cause: Predicate test error!! Review your query ….
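
The totals at the end of the test output can be compared automatically with the 12 pre-loaded messages. The sketch below assumes the plain key=value output shown in the last example and assumes that total-index counts messages that are both backed up and indexed; `readTotals` and `summarize` are hypothetical helper names.

```javascript
// Hypothetical helper: extracts total-backup / total-index from the plain
// (key=value) output of the test command.
function readTotals(output) {
  const totals = {};
  for (const line of output.split("\n")) {
    const m = /^\s*(total-backup|total-index)\s*=\s*(\d+)\s*$/.exec(line);
    if (m) {
      totals[m[1]] = Number(m[2]);
    }
  }
  return totals;
}

// Compares the totals with the number of pre-loaded messages (12 in the
// static data set described in this section).
function summarize(output, preloaded = 12) {
  const totals = readTotals(output);
  if (!("total-backup" in totals) || !("total-index" in totals)) {
    return null; // totals not found, e.g. the expression was rejected
  }
  return {
    excludedFromBackup: preloaded - totals["total-backup"],
    // Assumption: indexed messages are a subset of backed-up messages.
    excludedFromIndexOnly: totals["total-backup"] - totals["total-index"],
  };
}

const sampleOutput = ["   total-backup=6", "   total-index=4"].join("\n");
console.log(summarize(sampleOutput));
// { excludedFromBackup: 6, excludedFromIndexOnly: 2 }
```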

Fileset examples

Back up the full mailbox of some users, excluding some labels:

Fileset Example
FileSet {
   Name = fs-gw-drive-adjon-users-notemp
   Include {
      Options { signature = MD5 }
      Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
   user=\"adelev@baculasystems.com,jonis@baculasystems.com\" email_files_exclude=\"*.temporary\""
   }
}

Back up all mailboxes:

Fileset Example
FileSet {
   Name = fs-gw-email-all
   Include {
      Options { signature = MD5 }
      Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com"
   }
}

Back up only the Inbox label of some users:

Fileset Example
FileSet {
   Name = fs-gw-email-2user-inbox
   Include {
      Options { signature = MD5 }
      Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
   user=\"peter@baculasystems.com,john@baculasystems.com\" email_files=inbox"
   }
}

Back up some users and include settings:

Fileset Example
FileSet {
   Name = fs-gw-email-2user-mime
   Include {
      Options { signature = MD5 }
      Plugin = "gw: service=email credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=G39add31l1 admin_user_email=super@baculasystems.com
      user=\"peter@baculasystems.com,miriam@baculasystems.com\" email_settings=yes"
   }
}
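
Because the Plugin line is itself a quoted string, values that contain commas or spaces are wrapped in escaped quotes in the first example above. A small generator can apply that convention consistently. This is a sketch only: `buildPluginLine` is a hypothetical helper, and the quoting rule it applies is inferred from the examples in this section, not taken from a formal grammar.

```javascript
// Hypothetical helper: renders a Plugin line for the gw plugin from a plain
// object. Values containing commas or whitespace are wrapped in escaped
// quotes (\"...\"), mirroring the first fileset example above.
function buildPluginLine(params) {
  const parts = Object.entries(params).map(([key, value]) => {
    const text = String(value);
    const needsQuotes = /[,\s]/.test(text);
    return needsQuotes ? `${key}=\\"${text}\\"` : `${key}=${text}`;
  });
  return `Plugin = "gw: ${parts.join(" ")}"`;
}

console.log(
  buildPluginLine({
    service: "email",
    user: "peter@baculasystems.com,miriam@baculasystems.com",
    email_settings: "yes",
  })
);
```

The printed line keeps the inner quotes escaped, so the outer string is terminated only once, at the end.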

System labels

Google Mail can present label names to the user in local languages.

In general, there is no ‘multilanguage’ support, in the sense that labels must be specified with their original name. For example, if you create a label named ‘books’, it will not be backed up if you specify ‘livres’ or ‘libros’. You need to use the real name that was used to create the label.

There is one very important special case, though: ‘system labels’. System labels are labels such as ‘inbox’ or ‘sent’. A full list can be found here: https://developers.google.com/gmail/api/guides/labels

The plugin can find these labels by their standard name instead of their internal ID, which is what is used in the general case. Therefore, they can be selected with their well-known English name even if the user sees a translated name.

For example, to back up the inbox you must use ‘inbox’, even if some users see it as ‘Posteingang’ or ‘boîte de réception’. The Google Workspace Plugin recognizes these special words and queries the information through them.

To summarize:

  • System labels -> Use English word

  • Other user labels -> Use original name
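
The rule summarized above can be sketched as a small lookup that an administrator might use when preparing the email_files value. Everything here is illustrative: `resolveLabelName` is a hypothetical helper, the set holds only a partial list of system labels, and the translation map is an invented example, since the plugin itself does not translate names.

```javascript
// Partial list of Gmail system labels, in lower case (see the Google
// documentation linked above for the complete list).
const SYSTEM_LABELS = new Set([
  "inbox", "sent", "draft", "spam", "trash", "unread", "starred", "important",
]);

// Hypothetical helper: returns the name to use in the fileset for a label
// as the user sees it. `translations` maps localized display names (lower
// case) to English system label names and must be supplied by the caller.
function resolveLabelName(displayName, translations = {}) {
  const lower = displayName.trim().toLowerCase();
  if (SYSTEM_LABELS.has(lower)) {
    return { kind: "system", name: lower };
  }
  const english = translations[lower];
  if (english !== undefined && SYSTEM_LABELS.has(english)) {
    return { kind: "system", name: english };
  }
  // User labels keep the exact name they were created with.
  return { kind: "user", name: displayName };
}

const translations = { posteingang: "inbox", "boîte de réception": "inbox" };
console.log(resolveLabelName("Posteingang", translations));
// { kind: 'system', name: 'inbox' }
console.log(resolveLabelName("livres", translations));
// { kind: 'user', name: 'livres' }
```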