Category Archives: Linux

Duplicate mail messages

In my various shufflings, copyings, archivings of email messages between my IMAP folders, I often end up with duplicates.

Sometimes, a copy or move goes badly wrong and I end up with hundreds of duplicates.

Many years ago I wrote a bit of Java code which would find and remove duplicates, but I’ve now converted it to a Python script and released it as Open Source, in case it’s useful to anyone else.

You can find IMAPdedup here.

Feedback and improvements welcome!
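
For anyone curious about the general idea, here is a minimal sketch of duplicate detection. It is not IMAPdedup's actual code: it assumes the raw messages have already been fetched from the IMAP folder as bytes, and the function names `message_key` and `find_duplicates` are made up for this illustration. Messages are compared by their Message-ID header, falling back to a digest of a few other headers when that is missing.

```python
import hashlib
from email import message_from_bytes


def message_key(raw):
    """Return a key identifying a message: its Message-ID, else a header digest."""
    msg = message_from_bytes(raw)
    message_id = msg.get("Message-ID")
    if message_id:
        return str(message_id).strip()
    # Assumption: when Message-ID is missing, these headers are distinctive enough.
    basis = "|".join(str(msg.get(h, "")) for h in ("From", "To", "Date", "Subject"))
    return hashlib.sha256(basis.encode("utf-8", "replace")).hexdigest()


def find_duplicates(raw_messages):
    """Return the indices of messages whose key was already seen earlier."""
    seen = set()
    duplicates = []
    for index, raw in enumerate(raw_messages):
        key = message_key(raw)
        if key in seen:
            duplicates.append(index)  # a later copy of an earlier message
        else:
            seen.add(key)
    return duplicates
```

The first copy of each message is always kept, so the returned indices are the ones that could safely be flagged for deletion.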


Michael has made his rather nice ServerBar utility available.

If you have a Mac and you manage Unix-type machines (including other Macs, of course), this might be for you. It only really does one thing, but it does it well – it shows you the load on your remote machines – and it gives you a convenient shortcut (by clicking on the graph) to a terminal on any machine. If you know what SSH is, this might be of interest.



Warning – for geeks only…

I’ve just posted an alpha version of VNC2DL on github.

This is a VNC viewer which uses the new Open Source library from DisplayLink to display a VNC session on a USB-connected display, rather than in a window.

Just in case it’s useful to anyone…


“I would show you this on my laptop”, said a visitor to our company recently, “but it would take forever to boot up”.

And I realised how long I’d been living in a Mac world: for the last eight or nine years I’ve had a laptop where you open the lid and start typing pretty much immediately. (Camvine is an all-Mac shop except for the servers, which are Linux, and stay on all the time anyway.)

The slow start-up (and even rather painful resume-from-suspend) that people in the Windows world often experience has led to some modern machines having a minimal Linux installed alongside Windows, so you don’t have to wait for your entire world to load if you just want to check something quick on the web. Chris Nuttall, writing in the FT techblog, seems to be quite impressed with Presto.

Drop it in the box

I’ve only just started playing with Dropbox, but it looks very cool.

It’s what iDisk should have been. Software for Windows, Linux and Mac will create a Dropbox folder on your machine. Anything you drop on that folder is efficiently and securely synchronised to all other machines connected to the same account. It keeps past versions of updated files for you. The storage behind the scenes is Amazon’s S3 service. And if you’re using less than 2GB, Dropbox is free.

Here’s a more detailed write-up by Ryan Paul.

Some Linux backup utilities

For some years I’ve been backing up my various Linux-based servers, websites etc using a custom script which makes incremental tar-based backups of key directory hierarchies, dumps some MySQL databases, and then copies the lot to a remote machine using scp or rsync. We run this each night using cron. It’s worked well, but it’s becoming rather spaghetti-like since we run some version of it on several machines, copying stuff to several other machines. And the process of pruning old backups to keep disk usage under control at both the sources and the destinations is somewhat haphazard.

So I’ve been looking at various other backup systems which may do a more manageable job. The big systems in the Unix world are the venerable Amanda and the more recent but highly-respected Bacula. I may do something based around Bacula in due course, but for now I needed something quick. Here’s a quick rundown of some useful backing-up scripts. They all make use of rsync, or the rsync algorithm, in some way, but do more than just copy from A to B.

rdiff-backup: You can think of this as an rsync which keeps some history. The destination ends up with a copy of the source but also has a subdirectory containing reverse-diffs so you can get back to earlier versions. This is rather nice, I think, and it can pull the backups from or push them to a remote machine, though it does need to be installed at both ends. It’s mostly Python and relies on librsync. The standard Ubuntu rdiff-backup package isn’t very recent so I built and installed it by hand.
duplicity: This looks good and is being actively maintained. It’s a bit like rdiff-backup but focuses on encryption, and uses incremental tar-based backups. For me, the downside was that it’s push-only – you run it on one machine to send backups to another – and I was more keen on pulling from several machines under centralised control. Update: I later discovered that pushing can have some real advantages. One is that it can often be easier to manage the permissions of the backup user on the machine where the data exists. It might be a cron job run as root, for example. Another is that you may not always be able to install software or cron jobs on the machine where you want to store the backups. Also, duplicity has some interesting backends for things like Amazon S3. I’m using duplicity more now than when I first wrote this.
rsnapshot: In the short term, I think this is the one that will suit me best. You can create categories like ‘hourly’, ‘daily’, ‘monthly’, and specify how many of each you’d like kept. It creates a complete directory tree for each snapshot, but where the files haven’t changed they are simply hard links to the previous ones, so it’s pretty efficient on space. And a single configuration file can perform lots of remote and local backups. I suppose the downside is that the hard-link based architecture limits the range of filesystems on which you can store your backups, but if you’re firmly in the Unix world this seems to work rather well.
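
To make the hard-link idea concrete, here is a rough sketch of how a snapshot can look like a complete copy while only storing the files that changed. This is an illustration under my own assumptions, not the actual implementation of any of the tools above; the `snapshot` function and its parameters are hypothetical, and it ignores symlinks, permissions on directories, and deletions of old snapshots.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = None
            if previous is not None:
                old = os.path.normpath(os.path.join(previous, rel, name))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share storage with the old snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Calling `snapshot(src, new_dir, previous=last_dir)` each night gives a browsable directory per backup. Because it relies on `os.link`, the snapshots must all live on one filesystem that supports hard links, which is the limitation mentioned above.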

Just in case anyone else is looking…

Update: Emanuel Carnevale reminded me about:

Unison is a bit like rsync but does bi-directional synchronisation – it can cope with changes being made at either end. I hadn’t really thought of it as a backup tool, but – perhaps because two-way synchronisation can sometimes do unexpected things – it does have the ability to keep backups of any files it replaces. One more option if you need it…!

Falling markets

My first computer, a Sinclair ZX81, cost £69.95. Since then, every computer I’ve owned has cost more – usually substantially more. Until today.

Today I bought a new laptop for £179 inc. VAT, which in real terms is less than my ZX81 of 27 years ago. Progress at last! And this one I didn’t have to plug into a cassette deck and an elderly black-and-white TV!

It’s an Acer Aspire One, and I have to say that, so far, I’m really impressed. It runs OpenOffice, Firefox, Thunderbird and Skype very nicely, and it includes a few things like a camera and microphone that work remarkably well – I’ve just had a video-Skype call with my pal Jason while walking around the house.


Of course, it has some limitations – it boots up very much faster than any Windows machine I’ve ever seen but it’s not like a Mac’s almost instantaneous wake-up from sleep. I couldn’t write this post on it, but only because it can’t read the RAW-format images from my SLR, and I couldn’t watch movie trailers on the Apple site because you can’t get Quicktime for Linux. But the number of things it can do rather well is remarkable, and I could happily survive with it for a weekend when I didn’t want to carry anything heavier, or use it to catch up on news at the breakfast table.

It may not be a Mac, but it’s certainly not a ZX81!

Ubuntu Netboot installation

If you have an existing Linux machine (already running GRUB) and you want to install a fresh version of Ubuntu on it, this page may be handy. All you need to do is download a kernel and an initrd file, reboot and issue a couple of GRUB command lines, and you can install everything else over the network from the Ubuntu repositories.

I’ve just got a new hosted server which came with 6.06 installed, and I wanted to wipe it and start with a clean 8.04. This was a very quick and easy way to do it, especially since I didn’t have easy access to the machine’s CD/DVD drive.

Ahead of its time?

In 2001 at the AT&T Labs in Cambridge, we created a system we called the Broadband Phone:

Basically, it was a Linux-based VOIP phone with a VNC viewer and touch screen built in to it, and we built a GUI toolkit which rendered directly over the network in VNC. A standard Dell PC operated as the phone exchange (I wish we’d had Asterisk then!) and also provided the graphics for a variety of specially-written applications. It drove about 100 phones without any trouble, and we used this as our internal phone system in the lab for some time. The plan was to spin out a company based around the technology, but this was 2001, and you couldn’t get funding for new companies, whatever you did!

Anyway, at one point I created a cordless version based around a Compaq iPaq. I came across a publicity photo of it recently, and it took me a moment to realise why it looked so familiar:

Perhaps we were just too far ahead of the curve… 🙂

You can find my original pages about the Broadband Phone project here on the Internet Archive.

SSH ProxyCommand

Here’s an exceedingly useful feature of SSH which I only discovered recently.

Imagine that you have a single ‘gateway’ machine on your network which you can connect to from outside using SSH; I do this all the time. You can then use that machine to connect to other machines inside your network in a variety of ways: using the port-forwarding abilities of SSH (the -L and -R options), for example, or simply by running another SSH command from the gateway machine once you’ve connected to it.

But there’s a much tidier way to do it, using the ProxyCommand option.

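To connect to an internal machine, just add something like the following to your ~/.ssh/config (internal.example.com and gateway.example.com are placeholder names; substitute your own):

     Host internal.example.com
     ProxyCommand ssh gateway.example.com exec nc %h %p

then you can ssh directly to the internal machine from outside. SSH will connect to the gateway machine and run ‘nc’ to forward the SSH session to the internal machine.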

And, of course, you can use it for things layered over SSH, like checkouts from Git or Subversion repositories. Very tidy! I also sometimes add -C to the ssh command so that any access done this way is automatically compressed, even in situations where it was hard to specify that explicitly.
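
If you have several internal machines, the config entries are repetitive enough to generate. Here is a small sketch that does so; the `proxy_stanza` helper and the host names are invented for illustration, and the `compress` flag simply adds the -C option mentioned above to the ssh command that runs on your side of the gateway connection.

```python
def proxy_stanza(host, gateway, compress=False):
    """Build a ~/.ssh/config entry that reaches `host` through `gateway`."""
    ssh = "ssh -C" if compress else "ssh"
    return (
        f"Host {host}\n"
        f"    ProxyCommand {ssh} {gateway} exec nc %h %p\n"
    )


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    for internal in ("build.example.com", "wiki.example.com"):
        print(proxy_stanza(internal, "gateway.example.com", compress=True))
```

The output can be pasted into ~/.ssh/config; ssh itself fills in %h and %p with the target host and port when it runs the ProxyCommand.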

If you’re unlucky enough to find yourself stuck behind a web proxy with no other outgoing access, one very nice-looking use of ProxyCommand is the Corkscrew utility by Pat Padgett.

Hope this is helpful to someone!

Update: there are a few useful extra tips in the comments.

Build version numbering with Git

The ‘Git’ version-control system is used to develop the Linux kernel, amongst other things, and it’s the most powerful one I’ve used. (And I’m old enough to remember SCCS :-)) It takes some work to get your head around Git, but we’re now using it to develop our CODA system, and it’s been well worth it.

Michael came up with a nice way to number our build versions and has written it up on his blog – might be of interest if you’re using Git already.

If you aren’t, Randal Schwartz’s talk is a good intro.
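
For a flavour of what Git-derived build numbers can look like, here is one common approach based on the output of `git describe`. To be clear, this is my own sketch and not necessarily the scheme Michael describes; the function names and the resulting version format are assumptions made for the example.

```python
import re
import subprocess

# Matches output such as "v1.0.4-14-g2414721", optionally ending in "-dirty".
_DESCRIBE = re.compile(
    r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def build_version(describe):
    """Turn `git describe` output into a build version string."""
    describe = describe.strip()
    match = _DESCRIBE.match(describe)
    if match is None:
        return describe  # exactly on a tag (or an unexpected format): use as-is
    version = "{}+{}.g{}".format(
        match.group("tag"), match.group("count"), match.group("sha")
    )
    if match.group("dirty"):
        version += ".dirty"  # built from a tree with uncommitted changes
    return version


def describe_head():
    """Ask Git to describe the current checkout; needs a repo with at least one tag."""
    result = subprocess.run(
        ["git", "describe", "--tags", "--dirty"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A build script could then call `build_version(describe_head())` and stamp the result into the build, giving every commit a distinct, ordered version relative to the most recent tag.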

Geek gifts 1

If you know somebody geeky enough to run a Linux desktop, they’d probably like a Tux Droid. It’s like a Nabaztag for hackers…

© Copyright Quentin Stafford-Fraser