Some Linux backup utilities

For some years I’ve been backing up my various Linux-based servers, websites, etc. using a custom script which makes incremental tar-based backups of key directory hierarchies, dumps some MySQL databases, and then copies the lot to a remote machine using scp or rsync. We run this each night using cron. It’s worked well, but it’s becoming rather spaghetti-like since we run some version of it on several machines, copying stuff to several other machines. And the process of pruning old backups to keep disk usage under control at both the sources and the destinations is somewhat haphazard.
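
As a rough illustration of what a less haphazard pruning step might look like (this is not the script described above), here is a minimal Python sketch that keeps the newest few archives in a directory and removes the rest. The function name `prune_backups`, the `*.tar.gz` pattern and the default of seven archives are all assumptions made for the example.

```python
from pathlib import Path


def prune_backups(backup_dir, keep=7, pattern="*.tar.gz", dry_run=True):
    """Remove all but the `keep` newest archives matching `pattern`.

    Returns the paths that were removed (or, in a dry run, would be).
    The name, pattern and default count are illustrative assumptions.
    """
    if keep < 0:
        raise ValueError("keep must be zero or more")
    archives = sorted(
        Path(backup_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    stale = archives[keep:]  # everything older than the newest `keep`
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Defaulting to a dry run is deliberate: something like `prune_backups("/var/backups/www", keep=14)` (a hypothetical path) only reports what would go, and nothing is deleted until `dry_run=False` is passed.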

So I’ve been looking at various other backup systems which may do a more manageable job. The big systems in the Unix world are the venerable Amanda and the more recent but highly-respected Bacula. I may do something based around Bacula in due course, but for now I needed something quick. Here’s a quick rundown of some useful backing-up scripts. They all make use of rsync, or the rsync algorithm, in some way, but do more than just copy from A to B.

rdiff-backup
You can think of this as an rsync which keeps some history. The destination ends up with a copy of the source but also has a subdirectory containing reverse-diffs so you can get back to earlier versions. This is rather nice, I think, and it can pull the backups from or push them to a remote machine, though it does need to be installed at both ends. It’s mostly Python and relies on librsync. The standard Ubuntu rdiff-backup package isn’t very recent so I built and installed it by hand.
duplicity
This looks good and is being actively maintained. It’s a bit like rdiff-backup but focuses on encryption, and uses incremental tar-based backups. For me, the downside was that it’s push-only – you run it on one machine to send backups to another – and I was more keen on pulling from several machines under centralised control. Update: I later discovered that pushing can have some real advantages. One is that it can often be easier to manage the permissions of the backup user on the machine where the data exists. It might be a cron job run as root, for example. Another is that you may not always be able to install software or cron jobs on the machine where you want to store the backups. Also, duplicity has some interesting backends for things like Amazon S3. I’m using duplicity more now than when I first wrote this.
rsnapshot
In the short term, I think this is the one that will suit me best. You can create categories like ‘hourly’, ‘daily’, ‘monthly’, and specify how many snapshots of each you’d like kept. Every snapshot appears as a complete copy of the directory tree, but files that haven’t changed are simply hard links to the copies in the previous snapshot, so it’s pretty efficient on space. And a single configuration file can perform lots of remote and local backups. I suppose the downside is that the hard-link based architecture limits the range of filesystems on which you can store your backups, but if you’re firmly in the Unix world this seems to work rather well.
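
To make the hard-link idea behind rsnapshot a little more concrete, here is a toy Python sketch. It is not how rsnapshot itself is implemented (which, as noted above, relies on rsync), and the `snapshot` function and its arguments are hypothetical. It also ignores symlinks, ownership and anything else a real tool has to care about.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, dest):
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`.

    `previous` is the prior snapshot directory, or None for the first run.
    Illustrative only: a toy version of the hard-link snapshot technique.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(dest, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(dest, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)  # unchanged: share the existing inode
            else:
                shutil.copy2(src, new)  # new or changed: make a real copy
```

Each snapshot directory then looks like a full copy, but an unchanged file takes up space only once. Deleting an old snapshot frees the space for a file only when no other snapshot still links to it.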

Just in case anyone else is looking…

Update: Emanuel Carnevale reminded me about:

Unison
Unison is a bit like rsync but does bi-directional synchronisation – it can cope with changes being made at either end. I hadn’t really thought of it as a backup tool, but – perhaps because two-way synchronisation can sometimes do unexpected things – it does have the ability to keep backups of any files it replaces. One more option if you need it…!

6 Comments

Nice post; too few people understand the importance of backing up.

Have you ever tried Unison? [http://www.cis.upenn.edu/~bcpierce/unison/]
It’s really interesting. I had the same setup as your spaghetti one, but I replaced rsync with Unison.

Ah – yes, thanks Emanuel – I’d forgotten Unison. I tried it some time ago and it’s quite useful; I think it’s almost the only option if you want to synchronise in both directions.

BakBone Software also backs up Linux-based servers. http://www.bakbone.com

Thanks for this helpful information. I’ve saved it with ScrapBook (a Firefox add-on) to study later. I am new to Linux; can you recommend any books about Linux? Thanks.

© Copyright Quentin Stafford-Fraser