Mbox Extension

The Mbox Extension is a Cedar Backup extension used to incrementally back up UNIX-style mbox mail folders via the Cedar Backup command line. It is intended to be run either immediately before or immediately after the standard collect action.

Mbox mail folders are not well suited to the normal Cedar Backup incremental backup process. Active folders are typically appended to on a daily basis, which forces the incremental backup process to back up the whole folder every day in order to avoid losing data. For large mail folders, this can waste quite a bit of space.

The mbox extension uses the grepmail utility to back up only the email messages which have been received since the last incremental backup. This way, even if a folder is appended to every day, only the recently added messages are backed up, which can save a lot of space.
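The idea of keeping only recently received messages can be illustrated with a small sketch. This is not how the extension is implemented (the extension delegates the work to grepmail); it swaps in Python's standard mailbox module purely for illustration. The function name messages_since and the choice to filter on the Date header are assumptions made for this example.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def messages_since(mbox_path, cutoff):
    """Return messages whose Date header is at or after `cutoff`.

    `cutoff` must be a timezone-aware datetime. Hypothetical helper,
    not part of Cedar Backup.
    """
    recent = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            date_header = message.get("Date")
            if date_header is None:
                continue  # illustration only: undated messages are skipped
            try:
                sent = parsedate_to_datetime(date_header)
            except (TypeError, ValueError):
                continue  # unparseable date, skipped in this sketch
            if sent.tzinfo is None:
                # Assumption: treat dates without an offset as UTC.
                sent = sent.replace(tzinfo=timezone.utc)
            if sent >= cutoff:
                recent.append(message)
    finally:
        box.close()
    return recent
```

A real backup tool would need a policy for messages with missing or unparseable dates, since silently skipping them, as this sketch does, would lose data.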

Each configured mbox file or directory can be backed up using the same collect modes allowed for filesystems in the standard Cedar Backup collect action (weekly, daily, incremental), and the output can be compressed using either gzip or bzip2.

To enable this extension, add the following section to the Cedar Backup configuration file:

<extensions>
   <action>
      <name>mbox</name>
      <module>CedarBackup2.extend.mbox</module>
      <function>executeAction</function>
      <index>99</index>
   </action>
</extensions>
      

This extension relies on the options and collect configuration sections in the standard Cedar Backup configuration file, and then also requires its own mbox configuration section. This is an example mbox configuration section:

<mbox>
   <collect_mode>incr</collect_mode>
   <compress_mode>gzip</compress_mode>
   <file>
      <abs_path>/home/user1/mail/greylist</abs_path>
      <collect_mode>daily</collect_mode>
   </file>
   <dir>
      <abs_path>/home/user2/mail</abs_path>
   </dir>
   <dir>
      <abs_path>/home/user3/mail</abs_path>
      <exclude>
         <rel_path>spam</rel_path>
         <pattern>.*debian.*</pattern>
      </exclude>
   </dir>
</mbox>
      

Configuration is much like that of the standard collect action. The differences stem from the fact that mbox directories are not collected recursively.

Unlike collect configuration, exclusion information can only be configured at the mbox directory level (there are no global exclusions). Another difference is that absolute exclusion paths are not allowed; only relative path exclusions and patterns can be used.

The following elements are part of the mbox configuration section:

collect_mode

Default collect mode.

The collect mode describes how frequently an mbox file or directory is backed up. The mbox extension recognizes the same collect modes as the standard Cedar Backup collect action (see Chapter 2, Basic Concepts).

This value is the collect mode that will be used by default during the backup process. Individual files or directories (below) may override this value. If all individual files or directories provide their own value, then this default value may be omitted from configuration.

Note: if your backup device does not support multisession discs, then you should probably use the daily collect mode to avoid losing data.

Restrictions: Must be one of daily, weekly or incr.

compress_mode

Default compress mode.

Mbox file or directory backups are just text, and often compress quite well using gzip or bzip2. The compress mode describes how the backed-up data will be compressed, if at all.

This value is the compress mode that will be used by default during the backup process. Individual files or directories (below) may override this value. If all individual files or directories provide their own value, then this default value may be omitted from configuration.

Restrictions: Must be one of none, gzip or bzip2.
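The way per-item values override the section-level defaults can be sketched as follows. This is a minimal illustration using Python's standard xml.etree module on a trimmed copy of the example configuration above, not the parser Cedar Backup itself uses. The function name effective_modes is hypothetical, and raising an error when no mode is available at either level is an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example mbox section from this manual.
EXAMPLE = """
<mbox>
   <collect_mode>incr</collect_mode>
   <compress_mode>gzip</compress_mode>
   <file>
      <abs_path>/home/user1/mail/greylist</abs_path>
      <collect_mode>daily</collect_mode>
   </file>
   <dir>
      <abs_path>/home/user2/mail</abs_path>
   </dir>
</mbox>
"""


def effective_modes(mbox_xml):
    """Return (abs_path, collect_mode, compress_mode) with defaults applied."""
    root = ET.fromstring(mbox_xml)
    default_collect = root.findtext("collect_mode")
    default_compress = root.findtext("compress_mode")
    resolved = []
    for item in root.findall("file") + root.findall("dir"):
        path = item.findtext("abs_path")
        # A value on the item wins; otherwise fall back to the default.
        collect = item.findtext("collect_mode") or default_collect
        compress = item.findtext("compress_mode") or default_compress
        if not collect or not compress:
            # Assumption: a mode missing at both levels is an error.
            raise ValueError(f"no collect/compress mode for {path}")
        resolved.append((path, collect, compress))
    return resolved


for entry in effective_modes(EXAMPLE):
    print(entry)
# ('/home/user1/mail/greylist', 'daily', 'gzip')
# ('/home/user2/mail', 'incr', 'gzip')
```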

file

An individual mbox file to be collected.

This is a subsection which contains information about an individual mbox file to be backed up.

This section can be repeated as many times as is necessary. At least one mbox file or directory must be configured.

The file subsection contains the following fields:

collect_mode

Collect mode for this file.

This field is optional. If it doesn't exist, the backup will use the default collect mode.

Restrictions: Must be one of daily, weekly or incr.

compress_mode

Compress mode for this file.

This field is optional. If it doesn't exist, the backup will use the default compress mode.

Restrictions: Must be one of none, gzip or bzip2.

abs_path

Absolute path of the mbox file to back up.

Restrictions: Must be an absolute path.

dir

An mbox directory to be collected.

This is a subsection which contains information about an mbox directory to be backed up. An mbox directory is a directory containing mbox files. Every file in an mbox directory is assumed to be an mbox file. Mbox directories are not collected recursively: only the files immediately within the configured directory will be backed up, and any subdirectories will be ignored.

This section can be repeated as many times as is necessary. At least one mbox file or directory must be configured.
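The non-recursive behavior described above can be sketched in a few lines of Python. This is an illustration of the rule only, not the extension's own code, and the function name immediate_files is hypothetical.

```python
import os


def immediate_files(directory):
    """Return sorted paths of regular files directly inside `directory`.

    Subdirectories are ignored rather than descended into.
    """
    with os.scandir(directory) as entries:
        return sorted(entry.path for entry in entries if entry.is_file())
```

Given a directory containing the files inbox and sent plus a subdirectory archive, this returns only the paths of inbox and sent; nothing under archive is considered.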

The dir subsection contains the following fields:

collect_mode

Collect mode for this directory.

This field is optional. If it doesn't exist, the backup will use the default collect mode.

Restrictions: Must be one of daily, weekly or incr.

compress_mode

Compress mode for this directory.

This field is optional. If it doesn't exist, the backup will use the default compress mode.

Restrictions: Must be one of none, gzip or bzip2.

abs_path

Absolute path of the mbox directory to back up.

Restrictions: Must be an absolute path.

exclude

List of paths or patterns to exclude from the backup.

This is a subsection which contains a set of paths and patterns to be excluded within this mbox directory.

This section is entirely optional, and if it exists can also be empty.

The exclude subsection can contain any number of each of the following fields:

rel_path

A relative path to be excluded from the backup.

The path is assumed to be relative to the mbox directory itself. For instance, if the configured mbox directory is /home/user2/mail, a configured relative path of SPAM would exclude the path /home/user2/mail/SPAM.

This field can be repeated as many times as is necessary.

Restrictions: Must be non-empty.
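The relative path resolution described above amounts to joining the configured mbox directory and the relative path. A minimal sketch follows; the function name excluded_path is hypothetical, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def excluded_path(mbox_dir, rel_path):
    """Resolve a rel_path exclusion against its mbox directory."""
    return posixpath.join(mbox_dir, rel_path)


print(excluded_path("/home/user2/mail", "SPAM"))
# /home/user2/mail/SPAM
```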

pattern

A pattern to be excluded from the backup.

The pattern must be a Python regular expression. [19] It is assumed to be bounded at front and back by the beginning and end of the string (i.e. it is treated as if it begins with ^ and ends with $).

This field can be repeated as many times as is necessary.

Restrictions: Must be non-empty.
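Because patterns are implicitly anchored, a pattern must match the entire string and not just part of it. The sketch below shows the practical effect using Python's re module. The function name is_excluded is hypothetical, and the candidate strings are assumed to be full paths for the sake of the example; this section does not say exactly which string the extension matches against.

```python
import re


def is_excluded(candidate, pattern):
    """Apply `pattern` as if it began with ^ and ended with $."""
    return re.match(f"^{pattern}$", candidate) is not None


# The pattern from the example configuration matches anywhere in the string
# because it starts and ends with ".*".
print(is_excluded("/home/user3/mail/debian-user", ".*debian.*"))  # True

# A bare "debian" does not, because it would have to match the whole string.
print(is_excluded("/home/user3/mail/debian-user", "debian"))      # False

# Unrelated folders are unaffected.
print(is_excluded("/home/user3/mail/inbox", ".*debian.*"))        # False
```

In practice, this means a pattern intended to match a substring should be written with leading and trailing .* as in the example configuration.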