Google Workspace Plugin
Overview
This white-paper presents how to protect the most relevant elements of Google Workspace services using Bacula Enterprise.
Features
The Bacula Enterprise Google Workspace Plugin is easy to deploy and configure, and supports the following services:
Google Drive
Google Mail
It is shipped with advanced concurrency, resiliency, and flexibility features in addition to covering the most relevant Google Workspace backup use cases. A full feature list is presented below:
Common features
Google Workspace APIs based backups
Support for free Gmail accounts
Support for accounts under a Google Workspace subscription
Multi-service concurrency capabilities
Multi-threaded processes
Advanced tuning configurations
Automatic concurrency of fetching processes
Generation of user-friendly report for restore operations
Network resiliency mechanisms
Latest Google Authentication mechanisms
Discovery/List/Query capabilities
Restore objects to Google Workspace
To original entity
To any other entity
Restore any object to file-system
Restore HTML report to user mailbox or user drive
Backup and Restore of Google Drive
Backup and Restore of Users My Drive
Backup and Restore of Shared Drive Units
Hash check during backup and restore to ensure data integrity
Incremental & Differential backup
Includes advanced delta function for improved performance
Advanced selection capabilities
Include/exclude by name
Automatic discovery to backup everything
Include/exclude by RegEx
Folder selection capabilities for backup
Include/exclude by name
Automatic discovery to backup everything
Include/exclude by regular expressions
Support for regular files and also native Google Workspace files (export)
Folder and file granularity for restore
Computed hash check at backup and restore time
Backup and restore of permissions shares
Backup and restore of shared elements with users
Backup and restore of Google Drive file versions
Backup and restore of file comments
Backup and restore of trash
Backup and Restore of Google Mail (GMail)
Backup and Restore of email messages
Messages metadata
Messages content
Backup and Restore of attachments
Backup and Restore of mailbox settings
Auto-Forwarding, Imap, Language and Pop settings
Delegates
Filters
SendAs addresses
Forwarding addresses
Incremental & Differential backup with Delta function
Includes advanced delta function for improved performance
Advanced selection capabilities
Include/exclude users by name
Automatic discovery to backup all Workspace users
Include/exclude users by RegEx
Label selection capabilities for backup
Include/exclude by name
Automatic discovery to backup all of them
Include/exclude by regular expressions
Export mail messages to mime RFC 822 local files
Export attachments to local files
Restore to original GMail mailbox
Restore to a different user GMail mailbox
Restore to the original labels
Restore to a specific label
Fully indexed information into Bacula Catalog
Advanced search capabilities for restore operations
Privacy excluding features:
Ability to exclude message fields from the index
Exclude private or spam messages through powerful filtering capabilities
Note
Future modules
Bacula Google Workspace Plugin will include more modules in the future, like Google Calendar among others.
Requirements
Bacula Google Workspace Plugin supports free Gmail accounts and Workspace accounts.
Protecting Workspace accounts requires an active Google Workspace subscription: https://workspace.google.com/intl/es-419/pricing.html
It also requires full administrative access to the target Organization, in order to create a Google Application with all the needed permissions that will be used to communicate with this plugin.
Protecting free accounts only requires some configuration in Google Cloud Platform, done while logged in as the user to protect, before using the plugin. Please refer to the authentication section of this document for further details.
Currently, the plugin must be installed on a Linux based OS (RH, Debian, Ubuntu, SLES ..) where a Bacula Enterprise File Daemon is installed. Bacula Systems may address support for running this plugin on a Windows platform in a future version.
The OS where the File Daemon is installed must have Java version 11 or above installed.
Memory and computation requirements depend entirely on how this plugin is used (concurrency, environment size, etc.). However, the server where the File Daemon runs is expected to have a minimum of 4 GB of RAM. By default, every job can use up to 512 MB of RAM in demanding scenarios (usually less), although particular situations may require more. This memory limit can be adjusted internally (see Out of Memory). Refer to the Scope section below for any service-specific requirements.
Why protect Google Workspace?
This question arises frequently among IT and Backup professionals when it comes to SaaS or Cloud services, so it is important to understand the answer clearly.
It is a fact that Google, like any cloud provider, offers some capabilities intended to prevent data loss, such as:
Usually, all data stored in cloud services is geo-replicated using the underlying cloud infrastructure, so the information is stored in several destinations automatically and transparently. Therefore, complete data loss because of hardware failures is very unlikely to happen.
Google Data Loss Prevention service: This is a policy-based service capable of detecting filtered content and acting upon it by encrypting or modifying it in order to protect it (remove headers, etc.). This is not a backup tool; it is a service to prevent undesired actions on the content stored in Google Workspace (for example, sharing confidential information with the wrong people).
Retention policies of Google Workspace: Google retains a maximum of 30 days of deleted information from active subscriptions. Therefore, it is possible to recover accidentally deleted items within that period.
There is no other data protection mechanism. Below we show a list of challenges that are not covered by cloud services:
No Ransomware protection: If data suffers an attack and becomes encrypted, data is lost.
No malicious attacker protection: If data is deleted permanently, data is lost.
No real point-in-time recovery, and recoveries of partially deleted files are limited to 30 days.
It is not possible to align data protection of Google Workspace services to general retention periods or policies longer than 30 days.
No automated way to extract any data from the cloud to save it in external places (this could lead to eventual compliance problems)
Scope
Bacula Enterprise Google Workspace Plugin is applicable on environments using any Workspace subscription.
This paper presents solutions for Bacula Enterprise version 14.1 and later, and is not applicable to prior versions.
Note
Important considerations
Before using this plugin, please carefully read the elements discussed in this section.
Empty files
In general, empty files (files with 0 bytes of content) are not backed up by the Google Workspace plugin. In particular, for Google Drive a message in the joblog reports the empty files that were detected and therefore not processed.
Files and objects spooling
In general, this plugin backs up two types of information:
Objects
Files
Objects are elements representing some entity in Google Workspace, such as a file's metadata.
While objects are streamed directly from memory to the backup engine, files need to be downloaded to the FD host before being sent. This is done in order to run some checks and to improve overall performance, since operations can be parallelized this way. Every file is removed just after being completely downloaded and sent to the backup engine.
The path used for this purpose is established by the ‘path’ plugin variable, which is usually set in the gw_backend script to the value: /opt/bacula/working
Inside the path variable, a ‘spool’ directory will be created and used for those temporary download processes.
Therefore, the available disk space must be at least the size of the largest file in the backup session. If you are using concurrency between jobs or within the same job (this is the case by default, through the concurrent_threads=5 parameter), you need at least the size of the largest file multiplied by the number of operations you run in parallel.
For emails, it is important to note that download operations are done in one step because of some API requirements. This means the JVM must have enough memory to load those downloaded files into RAM. If you run into any memory issue, please refer to the troubleshooting section to find out how to increase it.
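As a rough sketch of how the spool location and the degree of parallelism can be controlled from the fileset, the FileSet below moves the plugin working path to a larger volume and lowers the number of parallel download threads. The resource name, the volume path and the admin email address are illustrative assumptions; path, concurrent_threads, service, credentials_file, customer_id and admin_user_email are plugin parameters described in the Configuration section of this document.

```conf
# Sketch only: names and values are examples, not recommendations.
FileSet {
  Name = FS_GW_DRIVE_SPOOL    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # path=/mnt/my-vol/ : the 'spool' directory is created under this path.
    # concurrent_threads=1 : fewer parallel downloads, so less spool space
    # (and less memory) is needed at any one time.
    Plugin = "gw: service=drive credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com path=/mnt/my-vol/ concurrent_threads=1"
  }
}
```

With the default concurrent_threads=5, the same volume would need room for roughly five times the largest file, following the sizing rule described above.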
Accurate Mode and Virtual Full Backups
Accurate mode and Virtual Full backups are not supported. These features will be addressed in future versions of this plugin.
Google Workspace APIs General Disclaimer
Google Workspace APIs are owned by Google and they can change or evolve at any time. Almost all service APIs are actively developed and gain new features every week, even if the version number of the service does not change as a result of those additions. As an example, the Google Drive API is currently tagged as v3 (and this plugin uses that version).
This situation is significantly different from traditional on-premise software, where each update is clearly numbered and controlled for a given server, so applications consuming that software can clearly state what is offered and which versions are supported.
Google is committed to not breaking any existing functionality that could affect external applications. However, this can still happen and therefore cause occasional problems with this plugin. Bacula Systems controls this with an advanced automatic monitoring system that continuously checks the correct behavior of existing features and will react quickly to such an event, but please be aware of the nature and implications of this kind of cloud technology.
Architecture
Bacula Enterprise Google Workspace Plugin uses several Google Workspace APIs to perform almost all of its operations. Therefore, the plugin works at the maximum granularity that the service provides.
All the information is retrieved using HTTP requests to Google Cloud from the FD where the plugin is installed.
The plugin will contact a Google Cloud Platform application that needs to be manually created and configured before using the plugin. It will serve as a bridge to download the required data or objects during backup time and send them to the Storage Daemon. Conversely, the plugin will receive them from an SD and perform uploads as needed during a restore operation.
The implementation is done through a Java Daemon, therefore Java is a requirement on the FD host. For more information about how to create the application in GCP, please consult the Authorization section.
Below is a simplified view of the architecture of this plugin inside a generic Bacula Enterprise deployment:
Listed below is the information that can be protected using this plugin:
Google Drive
My Drive of users
Folders
Native Google services files (gdocs, gslides, gpresentation.. Export and download)
All other files (regular download)
File Versions
Trash bin
Shared drives
Folders
Native Google services files (gdocs, gslides, gpresentation.. Export and download)
All other files (regular download)
File Versions
Trash bin
Shared permissions (direct access, share links, expiration times..)
SharedWithMe User files
Files comments
Google Mail
Mailbox user Labels
System labels: Inbox, Sent, Draft, Spam …
User labels
Mailbox user Mails
Metadata
Contents
Mail Attachments
Mailbox user Settings
Auto-Forwarding settings
Imap settings
Language settings
Pop settings
Delegates addresses
Filters
SendAs addresses
Forwarding addresses
All the metadata information of each object is stored in JSON format preserving all their original values.
Services and Features
In this section we will dig into how this plugin behaves for each particular service, describing special features and behaviors that require an extended description.
Special features
In the following section, special features and behaviors are detailed.
Installation
The Bacula File Daemon and the Google Workspace Plugin need to be installed on the host that will connect to the cloud-based services. The plugin is implemented over a Java layer, so it can be deployed on whichever officially supported Bacula Enterprise platform best suits your needs (RHEL, SLES, Debian, Ubuntu, etc.). Note that you may want to deploy your File Daemon and the plugin on a virtual machine in Google Cloud Platform in order to reduce the latency between it and the Google Workspace APIs.
The system must have Java >= 11 installed (openjdk-11-jre, for example) and the Java executable must be available in the system PATH.
Bacula Packages
We are taking Debian Buster as the example base system to proceed with the installation of the Bacula Enterprise Google Workspace Plugin. In this system, the installation is most easily done by adding the repository file suitable for the existing subscription and the Debian version utilized. An example would be /etc/apt/sources.list.d/bacula.list with the following content:
# Bacula Enterprise
deb https://www.baculasystems.com/dl/@customer-string@/debs/bin/@version@/buster-64/ buster main
deb https://www.baculasystems.com/dl/@customer-string@/debs/gw/@version@/buster-64/ buster gw
After that, a run of apt update is needed:
apt update
Then, the plugin may be installed using:
apt install bacula-enterprise-google-workspace-plugin
The plugin consists of two packages, which should be installed automatically with the command shown:
bacula-enterprise-google-workspace-plugin
bacula-enterprise-google-workspace-plugin-libs
Alternatively, the packages may be installed manually after downloading them from your Bacula Systems provided download area, using the package manager. An example:
dpkg -i bacula-enterprise-*
The package will install the following elements:
Jar libraries in /opt/bacula/lib (such as bacula-google-workspace-plugin-x.x.x.jar and bacula-google-workspace-plugin-libs-x.x.x.jar). Please note that the version of those jar archives is not aligned with the version of the package. However, that version will be shown in the joblog in a message like ‘Jar version:X.X.X’.
Plugin connection file (gw-fd.so) in the plugins directory (usually /opt/bacula/plugins)
Backend file (gw_backend) that invokes the jar files in /opt/bacula/bin. This backend file searches for the most recent bacula-google-workspace-plugin-x.x.x.jar file in order to launch it, even though usually we should have only one file.
Configuration
Fileset Configuration
Once the plugin is successfully authorized, it is possible to define regular filesets for backup jobs in Bacula, where we need to include a line similar to the one below, in order to call the Google Workspace Plugin:
FileSet {
  Name = FS_GW
  Include {
    Options {
      signature = MD5
      ...
    }
    Plugin = "gw: <gw-parameter-1>=<gw-value-1> <gw-parameter-2>=<gw-value-2> ..."
  }
}
It is strongly recommended to use only one ‘Plugin’ line in every fileset. The plugin offers the flexibility needed to combine different modules or entities to back up inside the same plugin line. Different workspaces, if there are several, should use different filesets and different jobs.
The sub-sections below list all the parameters you can use to control the GW Plugin behavior.
In this plugin, any parameter that accepts a list of values takes them separated by ‘,’.
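To illustrate the comma-separated list syntax, here is a hedged sketch of a single Plugin line that selects one service and two users. The resource name and all email addresses are made-up placeholders; service, user, credentials_file, customer_id and admin_user_email are parameters listed in the tables of this section. Writing the list without spaces is an assumption, made because a space separates one parameter from the next in the Plugin line.

```conf
# Sketch only: placeholder names and addresses.
FileSet {
  Name = FS_GW_TWO_USERS    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # 'user' receives a list of two values separated by ','
    Plugin = "gw: service=drive user=alice@example.com,bob@example.com credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com"
  }
}
```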
Common parameters
These parameters are common and applicable to all the modules of the Google Workspace Plugin.
Option | Required | Default | Values | Example | Description |
---|---|---|---|---|---|
abort_on_error | No | No | No, Yes | Yes | If set to Yes: abort the job as soon as any error is found with any element. If set to No: jobs can continue even if they find a problem with some elements; they will try to back up or restore the remaining elements and only show a warning |
config_file | No | | The path pointing to a file containing any combination of plugin parameters | /opt/bacula/etc/gw.settings | Defines a config file where any parameter of the plugin can be configured, so the parameters do not need to be put directly in the Plugin line of the fileset. This is especially useful for data shared between filesets and/or sensitive data such as customer_id |
log | No | /opt/bacula/working/gw/gw-debug.log | An existing path with enough permissions for the File Daemon to create a file with the provided name | /tmp/gw.log | Generates an additional log in addition to what is shown in the job log. This parameter is included in the backend file, so, in general, the log is stored in the working directory by default |
debug | No | 0 | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 | Debug level. Greater values generate more debug information | Generates the working/gw/gw-debug.log* files containing debug information, which is more verbose with a greater debug number |
path | No | /opt/bacula/working | An existing path with enough permissions for the File Daemon to create any internal plugin file | /mnt/my-vol/ | Uses this path to store metadata, plugin internal information and temporary files |
customer_id | No | | String representing the customer id associated to the Google Workspace subscription | Cbdi2930doi | The customer id associated to the Google Workspace subscription to be backed up. Please check the authentication section of this document for more detailed information. Note that this is mandatory if you want to protect a workspace environment, but not needed to protect free Gmail accounts |
admin_user_email | No | | A valid email address of one admin user of the Google Workspace subscription | | The email address of an admin user of the Google Workspace subscription to be protected. Please check the authentication section of this document for more detailed information. Note that this is mandatory if you want to protect a workspace environment, but not needed to protect free Gmail accounts |
credentials_file | Yes | | The path of the file where credentials are stored | /opt/bacula/etc/gw_credentials.json | The path of the file downloaded from the configured Google Cloud application that will act as a bridge to allow the communication between this plugin and Google Workspace. Please check the authentication section of this document for more detailed information |
tokens_path | No* | tokens | A path with enough permissions so the File Daemon can write in it | /home/user/my_path_to_tokens | The path used to store the login cache for users authenticated through the device code flow, which is relative to the path folder (usually working/gw/customer_id/tokens_path/). This is not used and not needed for protecting a workspace with a subscription |
auth_port | No* | 8888 | An integer with an open port number suitable for receiving the answer from Google Cloud services upon the delegated authentication request | 9999 | The port used to open the internal service that receives the authentication answer from Google Cloud services |
service | No | | drive, email | drive | Establishes the service or services that will be backed up. If this is not set, the plugin will try to back up all supported services. It is recommended to split the work among different jobs when several services need to be protected. Therefore, even if this field is not required, it is strongly recommended to use it in every backup job |
proxy_host | No | | String representing DNS Name or IP address of the http(s) proxy | myproxy.example.com | Set up a proxy to make any plugin HTTP connection |
proxy_port | No | | Integer | 3981 | Set up the proxy port |
proxy_user | No | | String of proxy user | admin | Set up the proxy user |
proxy_password | No | | String of proxy password | myPass123 | Set up the proxy user password |
The plugin supports two different kinds of users: Workspace users and free Gmail users.
For Workspace users, in addition to ‘credentials_file’, the following parameters are mandatory: ‘customer_id’ and ‘admin_user_email’.
For free Gmail users those parameters are not used, but it is possible to customize ‘tokens_path’ and ‘auth_port’.
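The two kinds of users could be configured along the following lines. This is a minimal sketch rather than a reference configuration: the resource names, the admin email address and the second credentials file path are hypothetical, and the remaining values are taken from the Example column of the table above.

```conf
# Workspace subscription: credentials_file, customer_id and
# admin_user_email are all mandatory.
FileSet {
  Name = FS_GW_WORKSPACE_MAIL    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    Plugin = "gw: service=email credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com"
  }
}

# Free Gmail account: customer_id and admin_user_email are not used;
# tokens_path and auth_port are optional customizations.
FileSet {
  Name = FS_GW_GMAIL_FREE        # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # The credentials file path below is a hypothetical example.
    Plugin = "gw: service=email credentials_file=/opt/bacula/etc/gw_credentials_gmail.json tokens_path=/home/user/my_path_to_tokens auth_port=9999"
  }
}
```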
Advanced common parameters
The following parameters are common to all Google Workspace modules (and even to some other plugins), but they are advanced ones. They should not be modified in most common use cases.
Option | Required | Default | Values | Example | Description |
---|---|---|---|---|---|
stream_sleep | No | 1 | Positive integer (1/10 seconds) | 5 | Time to sleep when reading header packets from FD and not having a full header available |
stream_max_wait | No | 120 | Positive integer (seconds) | 360 | Max wait time for FD to answer packet requests |
time_max_last_modify_log | No | 86400 | Positive integer (seconds) | 43200 | Maximum time to wait before overwriting a debug log that was marked as being used by another process |
logging_max_file_size | No | 50MB | String size | 300MB | Maximum size of a single debug log file |
logging_max_backup_index | No | 25 | Positive integer (number of files) | 50 | Maximum number of log files to keep |
log_rolling_file_pattern | No | gw.log.%d{dd-MMM}.log.gz | No, Yes | Yes | Log pattern for rotated log files |
split_config_file | No | = | Character | : | Character to be used in config_file parameter as separator for keys and values |
opener_queue_timeout_secs | No | 1200 | Positive integer (seconds) | 3600 | Timeout when internal object opener queue is full |
publisher_queue_timeout_secs | No | 1200 | Positive integer (seconds) | 3600 | Timeout when internal object publisher queue is full |
The internal plugin logging framework presents some relevant features that we are going to describe:
The “.log” files are rotated automatically. Currently each file can be 50 MB at maximum and the plugin will keep 25 files.
This behavior can be changed using the internal advanced parameters logging_max_file_size and logging_max_backup_index.
The “.err” file can show contents even if no real error happened in the jobs. It can show contents too even if debug is disabled. This file is not rotated, but it is expected to be a small file in general. If you still need to rotate it, you can include it in a general rotating tool like ‘logrotate’.
Backups in parallel and also failed backups will generate several log files. For example: gw-debug-0.log, gw-debug-1.log…
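As an illustration of the logging parameters, the sketch below raises the debug level, redirects the additional log and changes the rotation limits. The resource name and the admin email address are assumptions; the parameter values are the ones shown in the Example columns of the tables above.

```conf
# Sketch only: a temporary troubleshooting fileset.
FileSet {
  Name = FS_GW_DRIVE_DEBUG    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # debug=3                       : more verbose debug log
    # log=/tmp/gw.log               : location of the additional log
    # logging_max_file_size=300MB   : rotate after 300MB instead of 50MB
    # logging_max_backup_index=50   : keep 50 rotated files instead of 25
    Plugin = "gw: service=drive credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com debug=3 log=/tmp/gw.log logging_max_file_size=300MB logging_max_backup_index=50"
  }
}
```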
Tuning parameters
This set of parameters is common to all modules and they are advanced ones. In general, they should not be modified. They can be used to tune the behavior of the plugin, for example to make it more tolerant of poor network environments or of significant job concurrency.
Option | Required | Default | Values | Example | Description |
---|---|---|---|---|---|
backup_queue_size | No | 30 | 0-50 | 1 | Maximum number of enqueued internal operations between the static internal service threads (there are 3 communicating through queues of the set size: service fetcher, service opener and general publisher to Bacula core). This could potentially affect Google API concurrent requests and, consequently, Google throttling. In general, this parameter only needs to be modified if you are going to run different jobs in parallel |
concurrent_threads | No | 5 | 0-10 | 1 | Maximum number of concurrent backup threads running in parallel in order to fetch or open data for running download actions. This means every service fetcher and service opener will open this number of child concurrent threads. This will affect Google API concurrent requests. Google API can throttle requests depending on a variety of circumstances, but it is directly attached. In general, this parameter only needs to be modified if you are going to run different jobs in parallel. If you want precise control of your concurrency across different jobs, please set this value to 1. Please also be careful with the memory requirements: multi-threading very significantly increases memory consumption per job |
api_list_page_size | No | 500 | 1-500 | 350 | Maximum number of elements retrieved from the Google API for each page of objects. A higher number implies fewer requests, but more memory and more time for each request |
api_timeout | No | 9000 | Positive integer (milliseconds) | 60000 | Google call timeout inside HttpClient |
api_read_timeout | No | 300 | Positive integer (milliseconds) | 30000 | Google read timeout inside HttpClient |
api_retries | No | 5 | Positive integer (number of retries) | 10 | Google number of retries for retry-candidate requests |
api_retry_delay | No | 5 | Positive integer (seconds) | 10 | Google API delay between retries |
general_network_retries | No | 5 | Positive integer (number of retries) | 10 | Number of retries for the general external retry mechanism |
general_network_delay | No | 50 | Positive integer (seconds) | 100 | General Plugin delay between retries |
stats | No | No | 0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on | Yes | Include some stats information in the joblog. Useful to measure task times |
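As a hedged sketch of how these tuning parameters might be combined for a slow or unreliable network, the FileSet below reduces concurrency and makes the retry mechanisms more patient. The resource name and the admin email address are assumptions, and the tuning values are simply the ones from the Example column above, not measured recommendations.

```conf
# Sketch only: conservative settings for an unreliable network.
FileSet {
  Name = FS_GW_DRIVE_TUNED    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # concurrent_threads=1          : precise control of concurrency
    # api_retries / api_retry_delay : more and slower Google API retries
    # general_network_*             : more and slower general retries
    # stats=Yes                     : task timing information in the joblog
    Plugin = "gw: service=drive credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com concurrent_threads=1 api_retries=10 api_retry_delay=10 general_network_retries=10 general_network_delay=100 stats=Yes"
  }
}
```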
Entity parameters
The following parameters are shared by every module used in the same fileset line and are intended to select the target entities to back up. Every module subsection also mentions which entities are supported.
Option | Required | Default | Values | Example | Services | Description |
---|---|---|---|---|---|---|
user | No | | Valid email addresses of existing users on the selected workspace separated by ‘,’ | | drive, email | Back up selected services of this list of users. If no user is provided, and no other user parameter is set, all users will be discovered and included in the backup |
user_exclude | No | | Valid email addresses of existing users on the selected workspace separated by ‘,’ | | drive, email | Exclude selected services of selected users. If this is the only parameter found for selection, all elements will be included and this list will be excluded |
user_regex_include | No | | Valid regex | .*@management\.mydomain.com | drive, email | Back up selected services of matching users. |
user_regex_exclude | No | | Valid regex | .*@guests\.mydomain.com | drive, email | Exclude selected services of matching users. If this is the only parameter found for selection, all elements will be included and this list will be excluded |
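For instance, a fileset that protects the mailboxes of every discovered user except guest accounts could look like the sketch below. The resource name, the domain and the admin email address are placeholders. The regular expression is written without a backslash to sidestep any question about escaping inside the quoted Plugin string, so its dots match any character.

```conf
# Sketch only: placeholder domain and names.
FileSet {
  Name = FS_GW_MAIL_NO_GUESTS    # hypothetical resource name
  Include {
    Options {
      signature = MD5
    }
    # user_regex_exclude is the only selection parameter here, so all
    # users are included and the matching ones are excluded.
    Plugin = "gw: service=email credentials_file=/opt/bacula/etc/gw_credentials.json customer_id=Cbdi2930doi admin_user_email=admin@example.com user_regex_exclude=.*@guests.mydomain.com"
  }
}
```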
Backup parameters
Please, check the specific module pages in order to see backup parameters that are applicable only to each of them:
Restore parameters
The plugin is able to restore to the local file system on the server where the File Daemon is running or to the Google Workspace environment. The method is selected based on the value of the where parameter at restore time:
Empty or ‘/’ (example: where=/) → Google Workspace restore will be triggered
Any other path for where (example: where=/tmp) → Local file system restore will be triggered
When using Google Workspace restore option, the following parameters may be modified by selecting ‘Plugin Options’ during the bconsole restore session:
Option |
Required |
Default |
Values |
Example |
Services |
Description |
---|---|---|---|---|---|---|
destination_user |
No |
Existing email address on the target Google Workspace |
drive, email |
Destination User where restore data will be uploaded. If no user is set, every selected file will be restored in the original account |
||
destination_path |
No |
Destination path to be created (or existing) into the selected user (drive folder path) |
RestoreFolder |
drive, email |
Destination folder where all selected files to restore will be restored. If no path is set: - If no user is set either, every element will go to its original location - If a user is set using the variable destination_user: - Elements belonging to destination_user will be restored in their original location - Elements belonging to different users than destination_user will be restored in a new folder using the email address of the original user of the element |
|
send_report |
No |
0 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
1 |
drive, email |
Send a report to the user where every restore action is listed. - In drive service this will generate a new text file in the top restore folder |
allow_duplicates |
No |
1 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
0 |
drive, email |
Set if we allow to have several files with the same name in the same path or not (if not, we can overwrite the file using the ‘Replace’ general restore variable) |
drive_destination_shared_unit |
No |
Existing shared drive name |
MySharedDrive |
drive |
Destination drive shared unit where restored data will be uploaded. If no drive is set, every selected file will be restored in the original shared drive |
|
drive_skip_versions |
No |
1 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
0 |
drive |
Skip restoring former file versions (tagged with ‘###date’) even if they are selected. Important: Notice that this parameter is enabled by default, as we consider not restoring file versions the most common case. You need to disable it in order to have this kind of files restored |
drive_skip_comments |
No |
1 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
0 |
drive |
Skip restoring file comments (located inside the ‘filename_comments’ folder) even if they are selected. Important: Notice that this parameter is enabled by default, as we consider not restoring file comments the most common case. You need to disable it in order to have this kind of information restored |
drive_skip_sharedwitme |
No |
0 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
1 |
drive |
Skip restoring shared with me elements even if they are selected. |
drive_restore_share_permissions |
No |
0 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
1 |
drive |
Restore share permissions of every element in order to regenerate sharing information as allowed identities, shared links, etc. Important: Notice that this parameter is disabled by default, as we consider not restoring sharing permissions the most common case. You need to enable it in order to have shared permissions restored |
email_export |
No |
0 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
1 |
Export selected emails to MIME format in local filesystem (RFC 822) |
|
email_export_attachments_extract |
No |
1 |
0, no, No, false, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
0 |
Extract attachments of exported emails as independent files |
|
customer_id |
No |
String representing the customer id associated to the Google Workspace subscription |
Cbdi2930doi |
drive, email |
The customer id associated to the Google Workspace subscription to be backed up. Please, check the authentication section of this document for more detailed information. Note that this is mandatory if you want to protect a workspace environment, but not needed to protect gmail free accounts. |
|
admin_user_email |
No |
A valid email address of one admin user of the Google Workspace subscription |
drive, email |
The email address of an admin user of the Google Workspace subscription to be protected. Please, check the authentication section of this document for more detailed information. Note that this is mandatory if you want to protect a workspace environment, but not needed to protect gmail free accounts. |
||
credentials_file |
No |
The path of the file where credentials are stored |
/opt/bacula/etc/gw_credentials.json |
drive, email |
The path of the file downloaded from the configured Google Cloud application that will act as a bridge in order to allow the communication between this plugin and Google Workspace. Please, check the authentication section of this document for more detailed information. |
|
tokens_path |
No* |
A path with enough permissions so File Daemon can write in it |
/home/user/my_path_to_tokens |
drive, email |
The path that will be used to store the login cache for users authenticated through the device code flow, which is relative to the path folder (usually working/gw/customer_id/tokens_path/). It is neither used nor needed for protecting a workspace with a subscription. |
|
auth_port |
No* |
An integer with an open port number suitable for receiving the answer from Google Cloud services upon the delegated authentication request |
9999 |
drive, email |
The port to be used to open the internal service to receive the authentication answer from Google Cloud services |
|
foreign_container_generation |
No |
1 |
0, no, No, FALSE, false, off ; 1, yes, Yes, TRUE, true, on |
0 |
drive, email |
Generate a general container (usually a folder) in which to place restored objects coming from different entities. For example, if files from user a@workspace.com are restored into the drive of user b@workspace.com, enabling this option automatically generates a folder named a@workspace.com inside the destination restore folder of user b@workspace.com
debug |
No |
0, 1, 2, 3, 4, 5, 6, 7, 8, 9 |
3 |
drive, email |
Change debug level |
Operations
Backup
Google Workspace plugin backup configurations currently have just one specific requirement in the Job resource. Below we show some examples.
Job Example
The only special requirement with Google Workspace jobs is that Accurate mode backups must be disabled, as this feature is not supported at this time.
Job {
Name = gw-myworkspace-backup
FileSet = fs-gw-drive-all
Accurate = no
...
}
FileSet Examples
The plugin is flexible enough to configure almost any type of backup. Do not specify multiple Plugin= lines in the Include section of a FileSet for the Google Workspace Plugin.
Fileset examples for every supported service are linked below. For common purposes, the following two examples show how to configure an external config file or configure the number of threads:
Setup external config file:
FileSet {
Name = FS_GW_DRIVE
Include {
Options {
signature = MD5
}
Plugin = "gw: config_file=/opt/bacula/etc/gw.settings service=drive"
}
}
$ cat /opt/bacula/etc/gw.settings
Increase number of threads:
FileSet {
Name = fs-gw-drive-kara
Include {
Options {
signature = MD5
}
Plugin = "gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-sa-credentials.json
customer_id=\"B01ua5i29\" admin_user_email=\"peter@baculasystems.com\" service=drive
user=kara@baculasystems.com backup_threads=10"
}
}
More fileset examples for:
Restore
Restore operations are done using standard Bacula Enterprise bconsole commands.
The where parameter controls whether the restore is done locally to the File Daemon’s file system or to the Google Workspace service:
where=/ or empty value → Restore will be done over Google Workspace
where=/any/other/path → Restore will be done locally to the File Daemon file system
Restore options are described in the Restore parameters section of this document, so only an example restore session is shown here:
*restore where=/
First you select one or more JobIds that contain files
to be restored. You will be presented several methods
of specifying the JobIds. Then you will be allowed to
select which files from those JobIds are to be restored.
To select the JobIds, you have the following choices:
1: List last 20 Jobs run
2: List Jobs where a given File is saved
3: Enter list of comma separated JobIds to select
4: Enter SQL list command
5: Select the most recent backup for a client
6: Select backup for a client before a specified time
7: Enter a list of files to restore
8: Enter a list of files to restore before a specified time
9: Find the JobIds of the most recent backup for a client
10: Find the JobIds for a backup for a client before a specified time
11: Enter a list of directories to restore for found JobIds
12: Select full restore to a specified Job date
13: Select object to restore
14: Cancel
Select item: (1-14): 5
Automatically selected Client: 127.0.0.1-fd
Automatically selected FileSet: FS_GW
+-------+-------+----------+----------+---------------------+-------------------+
| jobid | level | jobfiles | jobbytes | starttime | volumename |
+-------+-------+----------+----------+---------------------+-------------------+
| 1 | F | 29 | 125,994 | 2022-05-12 17:49:27 | TEST-2022-05-12:0 |
+-------+-------+----------+----------+---------------------+-------------------+
You have selected the following JobId: 1
Building directory tree for JobId(s) 1 ...
27 files inserted into the tree.
You are now entering file selection mode where you add (mark) and
remove (unmark) files to be restored. No files are initially added, unless
you used the "all" keyword on the command line.
Enter "done" to leave this mode.
cwd is: /
$ cd "/@gw/C02uv9t30/users/jorge@baculasystmes.com/drive/my drive/"
cwd is: /@gw/C02uv9t30/users/jorge@baculasystmes.com/drive/my drive/
$ ls
REGRESS_20220512174729/
sharedWithMe/
$ cd REGRESS_20220512174729/
cwd is: /@gw/C02uv9t30/users/jorge@baculasystmes.com/drive/my drive/REGRESS_20220512174729/
$ ls
Elitr.mp4
Elitr.mp4__comments/
Graeco.docx
Graeco.docx__comments/
Interpretaris/
Mnesarchum.ppt
Scelerisque.jpeg
Vivamus.doc
Vivamus.doc__comments/
$ mark *
20 files marked.
$ done
Bootstrap records written to /tmp/regress/working/127.0.0.1-dir.restore.2.bsr
The Job will require the following (*=>InChanger):
Volume(s) Storage(s) SD Device(s)
===========================================================================
TEST-2022-05-12:0 File FileStorage
Volumes marked with "*" are in the Autochanger.
20 files selected to be restored.
Using Catalog "MyCatalog"
Run Restore job
JobName: RestoreFiles
Bootstrap: /tmp/regress/working/127.0.0.1-dir.restore.2.bsr
Where: /
Replace: Always
FileSet: Full Set
Backup Client: 127.0.0.1-fd
Restore Client: 127.0.0.1-fd
Storage: File
When: 2022-05-12 18:03:23
Catalog: MyCatalog
Priority: 10
Plugin Options: *None*
OK to run? (Yes/mod/no): mod
Parameters to modify:
1: Level
2: Storage
3: Job
4: FileSet
5: Restore Client
6: When
7: Priority
8: Bootstrap
9: Where
10: File Relocation
11: Replace
12: JobId
13: Plugin Options
Select parameter to modify (1-13): 13
Automatically selected : gw: credentials_file="/home/jorge/projects/bacula-gw-plugin-sa-2.json" customer_id="C02uv9t30" admin_user_email="jorge@baculasystmes.com" service="drive" user="jorge@baculasystmes.com" drive_files="REGRESS_20220512174729" drive_shared_units_regex_exclude=".*" debug=6
Plugin Restore Options
Option Current Value Default Value
destination_user: *None* (*None*)
destination_path: *None* (*None*)
send_report: *None* (0)
allow_duplicates: *None* (1)
drive_destination_shared_unit: *None* (*None*)
drive_skip_versions: *None* (1)
drive_skip_comments: *None* (1)
drive_skip_sharedwithme: *None* (0)
drive_restore_share_permissions: *None* (0)
customer_id: *None* (*None*)
admin_user_email: *None* (*None*)
credentials_file: *None* (*None*)
tokens_path: *None* (*None*)
auth_port: *None* (*None*)
foreign_container_generation: *None* (1)
debug: *None* (*None*)
Use above plugin configuration? (Yes/mod/no): mod
You have the following choices:
1: destination_user (Destination User)
2: destination_path (Destination Path in google-workspace)
3: send_report (Send report of the restore operation to the affected user)
4: allow_duplicates (Allow Duplicate Objects (Files with the same name in the same folder, emails with same id..))
5: drive_destination_shared_unit (Destination Shared Unit name)
6: drive_skip_versions (Skip restoring file former versions (tagged with '###date') even if they are selected)
7: drive_skip_comments (Skip restoring file comments even if they are selected)
8: drive_skip_sharedwithme (Skip restoring shared with me elements even if they are selected)
9: drive_restore_share_permissions (Restore sharing permissions of the files, so they get shared with the same people than original files)
10: customer_id (Destination Workspace customer id)
11: admin_user_email (Destination Workspace admin user email)
12: credentials_file (Credentials file path to be used for authentication)
13: tokens_path (Directory to store authorization tokens for delegated permissions)
14: auth_port (Port to receive response from delegated authentication process)
15: foreign_container_generation (Generate a general container (usually a folder) to put inside restored objects coming from different entities)
16: debug (Change debug level)
Select parameter to modify (1-16): 2
Please enter a value for destination_path: restored1
Plugin Restore Options
Option Current Value Default Value
destination_user: *None* (*None*)
destination_path: restored1 (*None*)
send_report: *None* (0)
allow_duplicates: *None* (1)
drive_destination_shared_unit: *None* (*None*)
drive_skip_versions: *None* (1)
drive_skip_comments: *None* (1)
drive_skip_sharedwithme: *None* (0)
drive_restore_share_permissions: *None* (0)
customer_id: *None* (*None*)
admin_user_email: *None* (*None*)
credentials_file: *None* (*None*)
tokens_path: *None* (*None*)
auth_port: *None* (*None*)
foreign_container_generation: *None* (1)
debug: *None* (*None*)
Use above plugin configuration? (Yes/mod/no): yes
Run Restore job
JobName: RestoreFiles
Bootstrap: /tmp/regress/working/127.0.0.1-dir.restore.2.bsr
Where: /
Replace: Always
FileSet: Full Set
Backup Client: 127.0.0.1-fd
Restore Client: 127.0.0.1-fd
Storage: File
When: 2022-05-12 18:03:23
Catalog: MyCatalog
Priority: 10
Plugin Options: User specified
OK to run? (Yes/mod/no): yes
Job queued. JobId=3
Restore by service
In this section some example restore configurations will be shown:
Cross workspace restore
You can perform cross-workspace restores using the restore variables:
customer_id
admin_user_email
credentials_file
These variables must be set to the values of the destination workspace, where a connection application must already have been set up to allow the connection.
List
Information can be listed using the bconsole .ls command with a path. In general, the service parameter, the entity involved, and a path representing a folder must be provided.
There are some general commands (like listing users), while the rest of the commands need to have the service set.
Some examples are shown below:
List general info: Users of a workspace
These commands are shown using the bconsole .ls command, but they may also be used with the query interface (keep your variable values, but apply something like: .query plugin="…" client=xxxx parameter=xxx)
*.ls plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=Alo9783c12 admin_user_email=jorge@baculasystems.com" client=127.0.0.1-fd path=users
Connecting to Client 127.0.0.1-fd at 127.0.0.1:8102
-rw-r----- 1 nobody nogroup -1 1970-01-01 00:59:59 /jorge@baculasystems.com
-rw-r----- 1 nobody nogroup -1 1970-01-01 00:59:59 /kara@baculasystems.com
-rw-r----- 1 nobody nogroup -1 1970-01-01 00:59:59 /john@baculasystems.com
2000 OK estimate files=3 bytes=0
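When the output of .ls is post-processed by a script, each entry line can be split into its columns. The following Python sketch assumes the eight-column layout seen in the output above (permissions, links, owner, group, size, date, time, path). The names LsEntry and parse_ls_line are hypothetical helpers and are not part of Bacula or the plugin.

```python
from typing import NamedTuple, Optional


class LsEntry(NamedTuple):
    is_dir: bool   # True when the permissions column starts with "d"
    size: int      # -1 when no size is reported for the entry
    modified: str  # "YYYY-MM-DD HH:MM:SS", exactly as printed
    path: str


def parse_ls_line(line: str) -> Optional[LsEntry]:
    """Parse one '.ls' output line; return None for non-entry lines."""
    # Split into at most 8 fields so that paths containing spaces stay intact.
    fields = line.strip().split(None, 7)
    if len(fields) != 8 or fields[0][0] not in "-d":
        return None
    perms, _links, _owner, _group, size, date, time, path = fields
    try:
        size_value = int(size)
    except ValueError:
        return None
    return LsEntry(perms.startswith("d"), size_value, f"{date} {time}", path)
```

Status lines such as "Connecting to Client ..." or "2000 OK estimate ..." do not match the layout and yield None, so they can simply be filtered out.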
List Google Drive contents
*.ls plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=Alo9783c12 admin_user_email=jorge@baculasystems.com user=jorge@baculasystems.com service=drive" client=127.0.0.1-fd path=/
Connecting to Client 127.0.0.1-fd at 127.0.0.1:8102
-rw-r----- 1 nobody nogroup 373568 2022-05-13 11:24:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-13_11.24.33_06.html
-rw-r----- 1 nobody nogroup 372115 2022-05-12 18:03:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-12_18.03.36_10.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-12 18:03:41 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/restored1/
-rw-r----- 1 nobody nogroup 373602 2022-05-12 17:49:52 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-12_17.49.34_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-12 17:49:39 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220512174934/
drwxr-xr-x 1 nobody nogroup -1 2022-05-12 17:48:25 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220512174729/
-rw-r----- 1 nobody nogroup 372471 2022-05-12 10:47:16 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-12_10.47.03_10.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-12 10:47:08 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/RESTORED_SKIPVER_REGRESS_20220512104703/
-rw-r----- 1 nobody nogroup 376389 2022-05-12 10:46:50 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-12_10.46.32_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-12 10:44:35 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/SOURCE_REGRESS_20220512104335/
-rw-r----- 1 nobody nogroup 376609 2022-05-10 12:56:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.56.12_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:56:17 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510125612/
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:55:31 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/SRC_INCLUDE_REGRESS_20220510125529/
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:54:57 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/trash/SRC_REMOVE_REGRESS_20220510125403/
-rw-r----- 1 nobody nogroup 373736 2022-05-10 12:49:34 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.49.18_10.html
-rw-r----- 1 nobody nogroup 376810 2022-05-10 12:49:05 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.48.42_06.html
-rw-r----- 1 nobody nogroup 374586 2022-05-10 12:31:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.31.28_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:31:33 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510123128/
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:30:10 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/trash/SRC_REMOVE_REGRESS_20220510123006/
-rw-r----- 1 nobody nogroup 372465 2022-05-10 12:29:46 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.29.31_10.html
-rw-r----- 1 nobody nogroup 376395 2022-05-10 12:29:19 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.28.58_06.html
-rw-r----- 1 nobody nogroup 372210 2022-05-10 12:25:23 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_12.25.04_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:25:09 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510122504/
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 12:23:48 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510122254/
-rw-r----- 1 nobody nogroup 372218 2022-05-10 11:38:20 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_11.38.04_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 11:38:09 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510113804/
drwxr-xr-x 1 nobody nogroup -1 2022-05-10 11:36:53 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220510113557/
-rw-r----- 1 nobody nogroup 372210 2022-05-10 11:34:28 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_11.34.11_06.html
-rw-r----- 1 nobody nogroup 372196 2022-05-10 11:27:03 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_11.26.47_06.html
-rw-r----- 1 nobody nogroup 373412 2022-05-10 11:23:53 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_11.23.39_06.html
-rw-r----- 1 nobody nogroup 373574 2022-05-10 11:21:10 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-10_11.20.53_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-09 18:26:39 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/DoComplicateMyLife/
-rw-r----- 1 nobody nogroup 370347 2022-05-09 18:24:18 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-09_18.24.08_18.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-09 18:24:13 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/testingMyRestore/
-rw-r----- 1 nobody nogroup 373586 2022-05-09 17:56:49 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-09_17.56.33_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-09 17:56:38 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220509175633/
-rw-r----- 1 nobody nogroup 373701 2022-05-07 14:16:03 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_14.15.48_10.html
-rw-r----- 1 nobody nogroup 376757 2022-05-07 14:15:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_14.15.15_06.html
-rw-r----- 1 nobody nogroup 371504 2022-05-07 13:57:22 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_13.56.39_13.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-07 13:56:44 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/ANADALRG14/
-rw-r----- 1 nobody nogroup 371491 2022-05-07 13:21:14 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_13.20.31_12.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-07 13:20:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/ADjoker/
-rw-r----- 1 nobody nogroup 382293 2022-05-07 12:35:15 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_12.34.32_11.html
-rw-r----- 1 nobody nogroup 23352 2022-05-07 12:34:41 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/Brute.jpeg
-rw-r----- 1 nobody nogroup 18394 2022-05-07 12:34:40 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/Ultricies.txt
-rw-r----- 1 nobody nogroup 8693 2022-05-07 12:34:40 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/Suavitate.ppt
-rw-r----- 1 nobody nogroup 14981 2022-05-07 12:34:40 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/Cetero.txt
-rw-r----- 1 nobody nogroup 15752 2022-05-07 12:34:39 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/Fastidii.ppt
drwxr-xr-x 1 nobody nogroup -1 2022-05-07 12:34:38 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AFullRestore1/
-rw-r----- 1 nobody nogroup 370327 2022-05-07 12:15:58 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-07_12.15.48_10.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-07 12:15:54 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AverAlc/
drwxr-xr-x 1 nobody nogroup -1 2022-05-06 17:12:34 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220506171229/
drwxr-xr-x 1 nobody nogroup -1 2022-05-06 16:58:13 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220506165808/
-rw-r----- 1 nobody nogroup 372243 2022-05-06 13:49:21 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_13.49.06_06.html
-rw-r----- 1 nobody nogroup 372215 2022-05-06 13:39:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_13.39.20_06.html
-rw-r----- 1 nobody nogroup 372167 2022-05-06 13:13:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_13.13.33_06.html
-rw-r----- 1 nobody nogroup 372444 2022-05-06 13:09:03 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_13.08.47_10.html
-rw-r----- 1 nobody nogroup 376549 2022-05-06 13:08:34 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_13.08.11_06.html
-rw-r----- 1 nobody nogroup 372396 2022-05-06 11:20:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_11.20.37_10.html
-rw-r----- 1 nobody nogroup 376416 2022-05-06 11:20:24 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_11.20.06_06.html
-rw-r----- 1 nobody nogroup 373532 2022-05-06 09:51:58 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-06_09.51.43_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:19:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AutoSimple 2022-05-05 05.19.34/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:14:12 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AutoSimple 2022-05-05 05.14.10/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:03:02 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:46 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:35 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:31 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:23 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:20 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:12 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:02:09 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 17:01:58 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505170105/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:59:34 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:59:22 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:59:18 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:59:06 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:59:02 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:58:51 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:58:49 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:58:39 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:58:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:58:27 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220505165734/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:56:25 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AutoSimple 2022-05-05 04.56.13/
drwxr-xr-x 1 nobody nogroup -1 2022-05-05 16:56:15 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/AutoSimple 2022-05-05 04.56.13/
-rw-r----- 1 nobody nogroup 373488 2022-05-05 13:55:46 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-05_13.55.30_06.html
-rw-r----- 1 nobody nogroup 373499 2022-05-04 12:03:24 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-04_12.03.07_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-04 12:03:12 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220504120307/
drwxr-xr-x 1 nobody nogroup -1 2022-05-04 12:02:01 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220504120058/
-rw-r----- 1 nobody nogroup 373510 2022-05-03 13:32:34 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-03_13.32.18_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-03 13:32:23 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220503133217/
drwxr-xr-x 1 nobody nogroup -1 2022-05-03 13:31:04 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220503133010/
-rw-r----- 1 nobody nogroup 371497 2022-05-03 13:26:48 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-03_13.26.36_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-03 13:26:41 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/REGRESS_20220503132636/
-rw-r----- 1 nobody nogroup 372431 2022-05-02 17:53:43 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-02_17.53.29_10.html
-rw-r----- 1 nobody nogroup 376394 2022-05-02 17:53:17 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-02_17.52.56_06.html
-rw-r----- 1 nobody nogroup 373668 2022-05-02 13:59:36 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-02_13.59.19_10.html
-rw-r----- 1 nobody nogroup 375148 2022-05-02 13:59:09 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-02_13.58.48_06.html
drwxr-xr-x 1 nobody nogroup -1 2022-05-02 13:56:52 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/SOURCE_REGRESS_20220502135650/
-rw-r----- 1 nobody nogroup 372168 2022-05-02 13:39:07 /@gw/C02uv9t30/users/jorge@baculasystems.com/drive/my drive/BEE_RestoreReport_DRIVE_2022-05-02_13.38.51_06.html
2000 OK estimate files=100 bytes=15,775,509
Other query/list examples
*.ls plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=Alo9783c12 admin_user_email=jorge@baculasystems.com user=jorge@baculasystems.com service=drive" client=127.0.0.1-fd path=/folder1
List emails inside inbox
*.ls plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-credentials.json customer_id=Alo9783c12 admin_user_email=jorge@baculasystems.com user=jorge@baculasystems.com service=email" client=127.0.0.1-fd path=/inbox
Show logged-in free users
*.query plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-free.json" client=127.0.0.1-fd parameter=logged-users
Force login of a particular free user
*.query plugin="gw: credentials_file=/opt/bacula/etc/bacula-gw-plugin-free.json" client=127.0.0.1-fd parameter=login:myuser@gmail.com
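If these commands are generated by a script (for example, to feed bconsole from automation), the strings can be assembled from the parameter values. The sketch below only reproduces the command shapes shown in the examples above; the function names are hypothetical, and values containing spaces or quotes would need additional escaping that is not handled here.

```python
def gw_plugin_string(**params: str) -> str:
    """Join key=value pairs into a 'gw:' plugin string, keeping argument order."""
    pairs = " ".join(f"{key}={value}" for key, value in params.items())
    return f"gw: {pairs}"


def ls_command(plugin: str, client: str, path: str) -> str:
    """Build a bconsole .ls command for the given plugin string."""
    return f'.ls plugin="{plugin}" client={client} path={path}'


def query_command(plugin: str, client: str, parameter: str) -> str:
    """Build a bconsole .query command for the given plugin string."""
    return f'.query plugin="{plugin}" client={client} parameter={parameter}'


if __name__ == "__main__":
    plugin = gw_plugin_string(
        credentials_file="/opt/bacula/etc/bacula-gw-plugin-free.json"
    )
    print(query_command(plugin, "127.0.0.1-fd", "logged-users"))
```

The same plugin string can be passed to both helpers, which keeps the credentials and entity parameters in one place.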
Best practices
Jobs Distribution
It is recommended to split the target backup between different groups of entities, or even to have one job per entity (user, drive unit, etc.). This way, errors in one job will not invalidate a whole backup cycle in which some entities succeeded and others had errors. This also makes it easier to identify the cause of an error.
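With many users, one job per entity is easier to maintain if the resources are generated. The following Python sketch renders a Job and FileSet pair for the Drive service of one user, using only plugin parameters that appear in the examples of this document. The resource naming scheme (gw-drive-..., fs-gw-drive-...) and the helper names are assumptions made for illustration; the remaining Job directives are left as a comment to be completed for your environment.

```python
import re

# Doubled braces are literal braces in str.format templates.
JOB_TEMPLATE = """\
Job {{
  Name = gw-drive-{slug}
  FileSet = fs-gw-drive-{slug}
  Accurate = no
  # ... remaining Job directives for your environment
}}

FileSet {{
  Name = fs-gw-drive-{slug}
  Include {{
    Options {{
      signature = MD5
    }}
    Plugin = "gw: credentials_file={credentials_file} customer_id={customer_id} admin_user_email={admin_user_email} service=drive user={user}"
  }}
}}
"""


def job_slug(user: str) -> str:
    """Derive a resource-name suffix from the local part of an email address."""
    local_part = user.split("@", 1)[0]
    return re.sub(r"[^A-Za-z0-9_-]", "-", local_part)


def render_user_job(user: str, credentials_file: str,
                    customer_id: str, admin_user_email: str) -> str:
    """Render one Job/FileSet pair protecting the Drive of a single user."""
    return JOB_TEMPLATE.format(
        slug=job_slug(user),
        user=user,
        credentials_file=credentials_file,
        customer_id=customer_id,
        admin_user_email=admin_user_email,
    )
```

Calling render_user_job once per address returned by the users listing yields one independent job per user. Note that the suffix is taken from the local part only, so users with the same local part in different domains would need a different naming scheme.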
Concurrency
Google Workspace APIs impose a variety of boundaries that need to be considered. If a boundary is crossed, the corresponding API call will fail and the application will need to wait some amount of time to retry, which is different depending on the boundary crossed.
It is crucial to plan an adequate strategy to back up all the elements without reaching API boundaries. A single job implements some parallelism, which can be reduced up to a point, if necessary, using the variable backup_queue_size (default value is 30). This variable controls the size of the internal queues that connect the internal threads designed to fetch, open and send every item to the Bacula core. Reducing its size will ultimately produce (with a value of 1, for example) an execution very similar to a single-threaded process. In addition, the plugin has the concurrent_threads parameter, which controls the number of simultaneous processes fetching and downloading data (default value is 5).
Caution is recommended with concurrency over the same service (in general, a maximum of 4-5 jobs or threads working with the same service is recommended), and a step-by-step testing scenario should be planned before going into production. Another important point is the timing schedule, as some boundaries are related to time frames (number of requests per 10 minutes or per hour, for example). If you find that you reach boundaries when running all your backups during a single day of the week, try to use 2 or 3 days and spread the load across them in order to achieve better performance.
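Spreading jobs over several days can be planned with a simple round-robin assignment. The Python sketch below is only an illustration of that idea: the job names and day labels are hypothetical, and the assignment balances the number of jobs per day, not their size or duration.

```python
from typing import Dict, List, Sequence


def spread_jobs(jobs: Sequence[str], days: Sequence[str]) -> Dict[str, List[str]]:
    """Assign jobs to days round-robin so each day gets a similar number."""
    if not days:
        raise ValueError("at least one day is required")
    plan: Dict[str, List[str]] = {day: [] for day in days}
    for index, job in enumerate(jobs):
        plan[days[index % len(days)]].append(job)
    return plan


if __name__ == "__main__":
    jobs = ["gw-drive-jorge", "gw-drive-kara", "gw-drive-john", "gw-email-jorge"]
    for day, assigned in spread_jobs(jobs, ["sat", "sun"]).items():
        print(day, assigned)
```

The resulting plan can then be translated into the schedules of the corresponding jobs. If some entities are much larger than others, a size-aware distribution would be a better fit than a plain round-robin.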
Specifically for the Gmail module, in addition to concurrency, the plugin uses batch requests that are processed in parallel as soon as the answer is received. Therefore, throttling can be reached very easily, and it is recommended to use little or no concurrency with this module. By default, the Gmail module uses 2 threads in parallel. Even with 2, some throttled requests are to be expected. Limits can be raised on request with Google, but if this is not possible and you experience throttling problems with parallelism, we recommend disabling it completely (setting concurrent_threads = 1).
More information about Google Workspace API boundaries may be found here:
https://developers.google.com/drive/api/guides/limits
https://developers.google.com/gmail/api/reference/quota
Performance
The performance of this plugin is highly dependent on many external factors:
ISP latency and bandwidth
Network infrastructure
FD Host hardware
FD Load
…
In summary, it is not possible to establish an exact reference about how much time a backup will need to complete.
Some general guidelines to understand the performance we can get:
Many small objects to protect -> More objects per second, but lower throughput (MB/s)
Big files to protect -> Fewer objects per second, but higher throughput (MB/s)
It is recommended to benchmark your own environment based on your requirements and needs.
The automatic concurrency mechanism (using concurrent_threads=x, default is 5) should work well for most scenarios. However, fine-tuning is possible by defining one job per entity and controlling how many of them run in parallel, together with decreasing the concurrent_threads value in order to avoid throttling from Google Cloud APIs.
There are many possible strategies for using this plugin, so please study what best suits your needs before deploying the jobs for your entire environment, in order to get the best possible results:
You can have a job per entity (users, shared drives…) and all services
You can split your workload through a schedule, or try to run all your jobs together.
You can run jobs in parallel, or take advantage of concurrent_threads and so run fewer jobs in parallel
You can back up whole services or select precisely which elements you really need inside each service (folders, paths, exclusions…)
etc.
Specifically for the Drive service, in order to maximize performance, we additionally recommend the following:
- Disable comments backup
- Disable version history backup
- Run one job per user and use the full Drive (no path selection) so the Delta function is applied. Exclude all shared units in user jobs (drive_shared_units_regex_exclude=.*)
- Run one job per shared unit and use the full Drive (no path selection) so the Delta function is applied. Exclude all users in shared unit jobs (users_regex_exclude=.*)
The restart command has limitations with plugins, as it initiates the Job from scratch rather than continuing it. Bacula determines whether a Job is restarted or continued, but using the restart command will result in a new Job.
Troubleshooting
Listed in this section are some scenarios that are known to cause issues.
Out of Memory
If you ever face OutOfMemory errors of the Java daemon (you will find them in the gw-debug.err file), you are likely using a high level of concurrency through the internal concurrent_threads parameter and/or parallel jobs.
To overcome this situation you can:
Reduce concurrent_threads parameter
Reduce the number of jobs running in parallel
If you cannot do that you should increase JVM memory.
To increase JVM memory, you will need to:
Create this file: ‘/opt/bacula/etc/gw_backend.conf’.
Below is an example of its contents:
GW_JVM_MIN=2G
GW_JVM_MAX=8G
These values define the minimum (GW_JVM_MIN) and maximum (GW_JVM_MAX) memory assigned to the JVM heap. In this example, 2 GB is set as the minimum and 8 GB as the maximum. In general, these values should be more than enough. Be careful if you run jobs in parallel, as very large values combined with several simultaneous jobs could quickly consume all the memory of your host.
The ‘/opt/bacula/etc/gw_backend.conf’ file is not modified by package upgrades, so your memory settings persist.
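To judge whether a GW_JVM_MAX value is safe for a host, the worst-case heap usage can be estimated before running jobs in parallel. The Python sketch below assumes that every job running at the same time may grow its own JVM heap up to GW_JVM_MAX, and that sizes use the K/M/G suffixes of the example above; both points are assumptions for illustration, and the estimate covers heap only, not the total memory of each process.

```python
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_jvm_size(value: str) -> int:
    """Convert a size such as '8G' or '512M' to bytes."""
    text = value.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # no suffix: plain number of bytes


def worst_case_heap(gw_jvm_max: str, parallel_jobs: int) -> int:
    """Upper bound of heap bytes if every parallel job reaches GW_JVM_MAX."""
    if parallel_jobs < 0:
        raise ValueError("parallel_jobs must not be negative")
    return parse_jvm_size(gw_jvm_max) * parallel_jobs


if __name__ == "__main__":
    total = worst_case_heap("8G", 4)
    print(f"{total / 1024 ** 3:.0f} GiB of heap in the worst case")
```

Comparing this figure with the memory available on the File Daemon host gives a quick indication of whether to lower GW_JVM_MAX or to reduce the number of jobs running in parallel.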