cassandra_backups
is a tool to backup a Cassandra cluster using nodetool snapshots and
incremental backups on S3.
The scope of this project is to make it easier to backup a cluster to S3 and to combine snapshots and incremental backups.
This is a fork of http://github.com/jalvz/cassandra_backups Some differences with the original, as of 9/2019:
- Added a
timestamp
option in backup.
Limitations and Recommendations
-
Backup / Restore a multi-node cluster with this and the "Node Restart Method" generally works, but many options are untested (eg: operations scoped to a single column family).
-
It is possible to backup all keyspaces at once, but restore has to be done one by one.
-
Likewise, it is possible to backup from several host with a single command, but to restore you need to execute one command per host you need to restore into.
-
Since you might need to pass the AWS credentials to this program, it is a good idea to create an AMI role with the minimum permissions required (S3 read/write).
-
Old S3 folders are not automatically cleaned up.
-
Make sure you run this program with the right user and user options, so the restored data will have the same owner and permissions as the original data you backed up.
-
Restore a backup is a delicate operation, make sure to test it exhaustively!
Install
pip install cassandra_backups
It needs to be installed in your local (where you will run the command) and in all the Cassandra nodes.
You also need lzop
. On Debian/Ubuntu:
sudo apt-get install lzop
Other dependencies that you might not have:
sudo apt-get install libffi-dev g++ libssl-dev libyaml-dev libpython2.7-dev
Usage Examples
Backup
cassandra-backups
--s3-bucket-name=cassandra_snapshots
--s3-bucket-region=us-east-1
--s3-base-path=webapp_prod
--s3-ssenc
--aws-access-key-id=XXXXXXXXXXXXXXXXX
--aws-secret-access-key=xxxxxxxxxxxxxxxxxxxxxxxxx
backup
--hosts=cassandra_node_01.domain.com,cassandra_node02_domain.com
--use-sudo=true
--user=ubuntu
--timestamp=%U
Restore
cassandra-backups
--s3-bucket-name=cassandra_snapshots
--s3-bucket-region=us-east-1
--s3-base-path=webapp_prod
--aws-access-key-id=XXXXXXXXXXXXXXXXX
--aws-secret-access-key=xxxxxxxxxxxxxxxxxxxxxxxxx
restore
--keyspace=user_cf
--host=cassandra_node_01.domain.com
--user=ubuntu
--use-sudo=true
--sudo-user=cassandra
If you want incremental backups, remember to set the flag in the cassandra.yaml
config file.
The Node Restart Method
After backup / restore operations, the Node Restart Method can be applied.
Official documentation at the time of writing for Cassandra 2.1 is here: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_backup_snapshot_restore_t.html
A general approach, after using this tool to backup and restore, goes like this:
(assuming Cassandra 3.0 installed on /var/lib/cassandra/data
)
-
Check that
/var/lib/cassandra/data/KEYSPACE
and/tmp/cassandra_restore/KEYSPACE
list the same folders. Those folders represent column families with an id. Those ids have to match. They change, for instance, when youDROP
a column family or you add a node to the cluster. Old unused ids are kept around in/var/lib/cassandra/data/KEYSPACE
and never removed as part of anynodetool
orcqlsh
command (to my knowledge). It is safe torm -rf
them. You can run this query to check the current ids in use:select id from system_schema.tables where keyspace_name = KEYSPACE;
-
Check that files in
/tmp/cassandra_restore/KEYSPACE/COLUMN_FAMILY
have an expectable size. -
Stop all the nodes, and then, one by one, run:
rm -rf /var/lib/cassandra/commitlog/*
and:cd /tmp/cassandra_restore/KEYSPACE/COLUMN_FAMILY;
ls * | xargs mv -t /var/lib/cassandra/data/KEYSPACE/COLUMN_FAMILY
for everyCOLUMN_FAMILY
(andKEYSPACE
, if needed)xargs
will make sure thatmv
works regardless of how many files you have in the directory. -
Restart all the nodes, run
nodetool repair
, and perform some sanity checks. You can (should!) do it all with a tool like Fabric to avoid typing mistakes and other nightmares.
Disclaimer
This fork has been renamed to avoid naming confusions in the Python Package Index.