Up until now I’ve been doing my backups to Amazon S3 using my s3backup
script. While it’s simple and did what I needed at the time, I’ve decided to cut some of the costs by switching to incremental backups.
I went on to define what I was looking for, and after a short search I came upon duplicity, which supports efficient incremental backups to Amazon S3 (among many other backends). While duplicity has a simple CLI interface, I did come across two pitfalls when using S3.
The first one is that one must export the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, or else you get an error message from the underlying boto library:
File "//usr/lib64/python2.5/site-packages/boto/connection.py", line 148, in __init__
self.hmac = hmac.new(self.aws_secret_access_key, digestmod=sha)
AttributeError: S3Connection instance has no attribute 'aws_secret_access_key'
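In practice that means exporting both variables in the shell (or wrapper script) that runs duplicity, e.g. with placeholder values:
export AWS_ACCESS_KEY_ID="<your-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"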
The second thing to note is the way to specify buckets for duplicity. Instead of
s3://<bucket-name>/<prefix>
which is used by s3cmd (which is a great tool), duplicity expects
s3+http://<bucket-name>/<prefix>
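For instance, listing the files in an existing backup with this scheme should look something like this (bucket name and prefix being placeholders):
duplicity list-current-files s3+http://<bucket-name>/<prefix>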
Being aware of these points can save some time and frustration. To automate the backup process one can use cron. For example, add the following to your crontab:
AWS_ACCESS_KEY_ID="<your-key-id>"
AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
0 1 * * 0 duplicity --no-encryption <folder-to-backup> s3+http://<bucket-name>/<prefix> >> ~/backups.log 2>&1
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
I clear the environment variables afterwards, so the credentials won’t leak to other cron jobs unnecessarily.
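To check that the weekly job actually produces the expected chain of full and incremental backups, you can ask duplicity for its collection status, along the lines of:
duplicity collection-status s3+http://<bucket-name>/<prefix>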
Just a quick note… Duplicity now uses boto for its S3 support. Therefore, instead of setting the AWS_… environment variables, one only needs to add a file called ‘.boto’ in the user’s home directory that contains the following:
[Credentials]
aws_access_key_id = <your-key-id>
aws_secret_access_key = <your-secret-access-key>
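Since this file holds the credentials in plain text, it’s worth restricting its permissions, e.g.:
chmod 600 ~/.boto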
Worth noting that if you have non-US buckets, you may need to add:
--s3-use-new-style --s3-european-buckets
as command-line parameters for duplicity.
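Putting it all together, a run against a European bucket should look roughly like this (everything in angle brackets is, again, a placeholder):
duplicity --s3-use-new-style --s3-european-buckets --no-encryption <folder-to-backup> s3+http://<bucket-name>/<prefix>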