dup-composer

Dup-composer is a front-end script for Duplicity, that lets you define your backups in a configuration file and execute them in a simple way.


License
MIT
Install
pip install dup-composer==0.4.0.dev2

Documentation

Dup-composer

Dup-composer is a front-end script for Duplicity, that lets you define your backups in a configuration file and execute them in a simple way.

This is work in progress, hence it is not ready for production use yet.

To get started:

  • Install Duplicity version 0.7 or newer.
  • Install Dup-composer.
  • Create your YAML configuration file and define your backup groups configuration in this file.
  • Execute dupcomp, define the groups you wish to execute and whether you want to do a backup or restore.

You can find more detail for each step in its respective section.

Requirements

  • Duplicity 0.7+ installed and the duplicity script on your PATH.
  • Python 3.5+
  • Keyring 19.2+*
  • SecretStorage 3.1.1+*
  • PyYAML 5.1+*

* Will be installed by pip automatically.

Installation

Install from PyPI:

pip3 install dup-composer

Configuration

The configuration should follow YAML 1.1 syntax.

Let's start at the top, with the list of backup groups - the three dots (...) are placeholders for child and scalar nodes:

backup_groups:
  my_first_backup_group:
    ...
  my_second_backup_group:
    ...

The parent node of the groups is called backup_groups, which is currently the root of the configuration structure, but further configuration nodes might be added on the top level in the future.

Backup group

For each backup group, you have to have the following structure in place:

my_first_backup_group:
  encryption:
    ...
  backup_provider:
    ...    
  backup_file_prefixes:
    ...
  volume_size: ...
  sources:
    ...

The encryption node (mandatory) is the parent of the encryption related configuration, the children of backup_provider(mandatory) specify all the provider related properties. backup_file_prefixes is optional and contains the child nodes for archive, signature and manifest file prefix configuration. The volumes_size (mandatory) node determines the size of the backup archive file chunks in MBs.

Encryption

There are primarily two ways to set up the encryption node at the moment.

Encryption is turned off:

encryption:
  enabled: no

Encryption is turned on:

encryption:
  enabled: yes
  gpg_key: 123456789ABCDEFF
  gpg_passphrase: examplepassphrase123

If the enabled node is set to no, encryption is disabled, there is no need to configure the gpg_key and gpg_passphrase nodes. When encryption is enabled however, they are mandatory. This key will be used both for signing and encrypting the backup data.

Keyring support: The value of gpg_passphrase can also be read from a keyring. See the keyring support document for details on setting up the keyring to use with Dup-composer. Once the keyring has been set up, you can specify the passphrase in the following format: gpg_passphrase: ['service_name_in_the_keyring', 'account_name_in_the_keyring'].

Backup provider

The backup_provider configuration to be configured largely depends on the type of the provider, determined by the URL scheme:

backup_provider:
  url: file://

This configuration sets Dup-composer up to save the backup files on the local filesystem. There is no need to specify a concrete path here, as that will be determined by the sources section of the configuration. The URL will just set the context for those paths.

For a remote SFTP or SCP backup, you need a slightly different configuration:

backup_provider:
  url: sftp://myscpuser@host.example.com/
  password: examplepassword123

In this case, you need to specify the username of the remote SFTP host in the first part of the SFTP URL, which is what you would do using Duplicity directly as well. Use the password node to specify the password.

You can use SCP the same way, but use the scp:// protocol in the URL instead of sftp://.

Keyring support: The value of password can also be read from a keyring. See the keyring support document for details on setting up the keyring to use with Dup-composer. Once the keyring has been set up, you can specify the password in the following format: password: ['service_name_in_the_keyring', 'account_name_in_the_keyring'].

Finally, you have to configure AWS S3 like this:

backup_provider:
  url: s3://s3.sa-east-1.amazonaws.com/my-backup-bucket
  aws_access_key: EXAMPLEACCESSKEY
  aws_secret_key: ExAmPlESeCrEtKeY

The S3 bucket URL is configured as the url node value, while aws_access_key and aws_secret_key need to contain your AWS generated keys for the bucket. Like with the rest of the providers, the actual path, folder, within the bucket shouldn't be added to the URL.

Keyring support: The value of aws_secret_key can also be read from a keyring. See the keyring support document for details on setting up the keyring to use with Dup-composer. Once the keyring has been set up, you can specify the secret key in the following format: aws_secret_key: ['service_name_in_the_keyring', 'account_name_in_the_keyring'].

Backup file prefixes

The next feature comes handy if you want to prefix the generated backup file names in a specific way. I use this to set up bucket rules in S3, that move my archive files to Glacier. Here is an example of the configuration:

backup_file_prefixes:
  manifest: manifest_
  archive: archive_
  signature: signature_

The prefixes can be specifically set up for each file type generated at the backup location. Set these up as needed; you can leave the backup_file_prefixes node out altogether, if you don't need this feature.

Volumes

The volume_size node is rather simple: a number should be given as its value; this determines the archive size in megabytes.

Sources

Under the sources node in the configuration hierarchy, you can specify a list of locations (paths) you want to back up, where to back them up and where the restored data should go. You can set up multiple sources within a single group. Here is an example set of two sources configured:

sources:
  /var/www/html:
    backup_path: /root/backups
    restore_path: /root/restored
  /home/tommy:
    backup_path: /home/bkup/my-laptop-backups
    restore_path: /root/restored-from-backup

The source child nodes /var/www/html and /home/tommy determine the directory you want to back up, and backup_path prescribes the location the backup files will be saved to. In practice, the value of backup_path will be appended to the value of the provider url node discussed earlier; hence these two fragments give the true backup location. restore_path is not used during the backup step, but specifying it is mandatory at the moment. I will remove this requirement very soon, as it doesn't make any sense, until an actual restore has to happen.

There are a few limitations on the path data provided in this configuration:

  • They can't begin with a hyphen "-".
  • They can't contain any backslash characters.
  • They can't contain any newline characters.
  • Extra: Make sure to check quoting rules if you need to add any characters, that have a special meaning in the YAML syntax.

Include and exclude filters

You can configure a list of paths included in or excluded from the backup for each backup source. The configuration ends up being passed to Duplicity through the --include and --exclude command line options.

For example, taking the following configuration file filtered-backup.yml:

backup_groups:
  my_filtered_backup:
    encryption:
      enabled: no
    backup_provider:
      url: file://
    volume_size: 400
    sources:
      /var/www/html:
        backup_path: /home/backups/web_server_docroot
        restore_path: /var/www/html
        filters:
          - type: exclude
            path: /var/www/html/no_bak
          - type: include
            path: /var/www/thml/no_bak/important

and running Dup-composer like this:

dupcomp -c filtered-backup.yml backup

will result in the following Duplicity command:

duplicity --no-encryption --volsize 400 --exclude /var/www/html/no_bak --include /var/www/html/no_bak/important /var/www/html file:///home/backups/web_server_docroot

Important:

  • You can list as many include and exclude filters as needed.
  • The order in which you list the filters matters, as it will be passed in the same order to Duplicity. Please consult the Duplicity man pages to understand how the order impacts the processing.
  • The filters are only supported for backups.

Example

backup-compose.yml example:

backup_groups:
  my_local_backups:
    encryption:
      enabled: no
    backup_provider:
      url: file://
    volume_size: 200
    sources:
      /var/www/html:
        backup_path: /root/backups/var/www/html
        restore_path: /root/restored/var/www/html
      /home/tommy:
        backup_path: /root/backups/home/tommy
        restore_path: /root/restored/home/tommy
  my_s3_backups:
    encryption:
      enabled: yes
      gpg_key: xxxxxx
      gpg_passphrase: xxxxxx
    backup_provider:
      url: s3://s3.sa-east-1.amazonaws.com/my-backup-bucket
      aws_access_key: xxxxxx
      aws_secret_key: xxxxxx
    backup_file_prefixes:
      manifest: manifest_
      archive: archive_
      signature: signature_
    volume_size: 50
    sources:
      /etc:
        backup_path: /etc
        restore_path: /root/restored/etc
      /home/shared:
        backup_path: /home/shared
        restore_path: /root/restored/home/shared
  my_sftp_backups_with_keyring:
    encryption:
      enabled: no
    backup_provider:
      url: sftp://myscpuser@backuphost1.example.com/
      password: ['backuphost1', 'myscpuser']
    keyring:
      username: keyringuser
      bus_address: /run/user/1003/bus
    volume_size: 200
    sources:
      /home/fun:
        backup_path: /home/fun
        restore_path: /root/restored/home/fun
      /home/katy:
        backup_path: /home/katy
        restore_path: /root/restored/home/katy

Configuration change safeguard

To reduce the risk of unintentional changes, or changes that might have a negative effect on existing backup chains, Dup-composer will print an informational message and abort the execution if any of the existing backup groups are changed. The user can verify, if the change is intentional and if so, rerun Dup-composer with the -s option to disable the safeguard.

Usage

dupcomp -h
-----
usage: dupcomp.py [-d] [-s] [-f] [-c <configpath>] backup|restore [backup_group1 backup_group2 ...]

optional arguments:
 -d                dry run (just print the commands to be executed)
 -c <configpath>   use the configuration file at <configpath>
 -s                skip the configuration change safeguard step
 -f                force full backup
 -----