Category Archives: Linux

Brunch with Brent

Amongst the tech podcasts I enjoy while driving, dog-walking, etc. are the ones from Jupiter Broadcasting.

The Self-Hosted show, in particular, discusses topics and news of interest to those who like to run some of their own IT infrastructure rather than outsourcing it all to third parties. It covers areas like backups, VPNs, media servers, and home automation (one of my current hobbies)!

Linux Unplugged keeps me in touch with Linux news. Even though I’ve been a heavy user of Linux since the days when it was first released as two floppy disk images, and I run and manage a large number of Linux servers, both personally and professionally, I haven’t really used it as my desktop operating system since Apple’s release of MacOS X gave me a Unix-based alternative, so this helps keep me in touch with developments there as well as on the back-end.

Anyway, I recommend these if you’re interested in such geeky topics; I think they’re nicely produced.

And then there’s Jupiter Extras, a feed with a range of interviews and other stuff that doesn’t really fit into any of the other streams. One of the Jupiter hosts, Brent Gervais, has a set of periodic interviews labelled ‘Brunch with Brent’, and I was delighted to be invited to join him for one of these a little while ago (published yesterday), in which he let me ramble on about everything from scuba diving to the patent system, from QWERTY keyboards to self-driving cars.

The full discussion can be found on Episode 86, and there’s a shorter extract in the middle of the latest Linux Unplugged episode too.

Healthchecks in a Docker Swarm

This is a very geeky post for those who might be Googling for particular details of Linux containerisation technologies. Others, please feel free to ignore! We were searching for this information online today and couldn’t find it, so I thought I’d post it myself for the benefit of future travellers…

How happy are your containers?

In your Dockerfile, you can specify a HEALTHCHECK: a command that will be run periodically within the container to ascertain whether it seems to be basically happy.

A typical example for a container running a web server might try and retrieve the front page with curl, and exit with an error code if that fails. Something like this, perhaps:

HEALTHCHECK CMD /usr/bin/curl --fail http://localhost/ || exit 1

This will be called periodically by the Docker engine — every 30 seconds, by default — and if you look at your running containers, you can see whether the healthcheck is passing in the ‘STATUS’ field:

$ docker ps
CONTAINER ID   IMAGE           CREATED          STATUS                     NAMES
c9098f4d1933   website:latest  34 minutes ago   Up 33 minutes (healthy)    website_1

Now, you can configure this healthcheck in various ways and examine its state through the command line and other Docker utilities and APIs, but I had always thought that it wasn’t actually used for anything by Docker. But I was wrong.

If you are using Docker Swarm (which, in my opinion, not enough people do), then the swarm ensures that an instance of your container keeps running in order to provide your ‘service’. Or it may run several instances, if you’ve told the swarm to create more than one replica. If a container dies, it will be restarted, to ensure that the required number of replicas exist.

But a container doesn’t have to die in order to undergo this reincarnation. If it has a healthcheck and the healthcheck fails repeatedly, a container will be killed off and restarted by the swarm. This is a good thing, and just how it ought to work. But it’s remarkably hard to find any documentation which specifies this, and you can find disagreement on the web as to whether this actually happens, partly, I expect, because it doesn’t happen if you’re just running docker-compose.

But my colleague Nicholas and I saw some of our containers dying unexpectedly, wondered if this might be the issue, and decided to test it, as follows…

First, we needed a minimal container where we could easily change the healthcheck status. Here’s our Dockerfile:

FROM bash
RUN echo hi > /tmp/t
HEALTHCHECK CMD test -f /tmp/t
CMD bash -c "sleep 5h"

and we built our container with

docker build -t swarmtest .

When you start up this exciting container, it just goes to sleep for five hours. But it contains a little file called /tmp/t, and as long as that file exists, the healthcheck will be happy. If you then use docker exec to go into the running container and delete that file, its state will eventually change to unhealthy.

If you’re trying this, you need to be a little bit patient. By default, the check runs every 30 seconds, starting 30s after the container is launched. Then you go in and delete the file, and after the healthcheck has failed three times, it will be marked as unhealthy. If you don’t want to wait that long, there are some extra options you can add to the HEALTHCHECK line to speed things up.
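For example, something like this makes the check run every five seconds and give up after two failures (the numbers are purely illustrative):

HEALTHCHECK --interval=5s --retries=2 CMD test -f /tmp/t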

OK, so let’s create a docker-compose.yml file to make use of this. It’s about as small as you can get:

version: '3.8'

services:
  swarmtest:
    image: swarmtest

You can run this using docker-compose (or, now, without the hyphen):

docker compose up

or as a swarm stack using:

docker stack deploy -c docker-compose.yml swarmtest

(You don’t need some big infrastructure to use Docker Swarm; that’s one of its joys. It can manage large numbers of machines, but if you’re using Docker Desktop, for example, you can just run docker swarm init to enable Swarm on your local laptop.)

In either case, you can then use docker ps to find the container’s ID and start the healthcheck failing with

docker exec CONTAINER_ID rm /tmp/t

And so here’s a key difference between running something under docker compose and running it with docker stack deploy. With the former, after a couple of minutes, you’ll see the container change to ‘(unhealthy)’, but it will continue to run. The healthcheck is mostly just an extra bit of decoration; possibly useful, but it can be ignored.

With Docker Swarm, however, you’ll see the container marked as unhealthy, and shortly afterwards it will be killed off and restarted. So, yes, healthchecks are important if you’re running Docker Swarm, and if your container has been built to include one and, for some reason, you don’t want to use it, you need to disable it explicitly in the YAML file; otherwise your containers will be restarted every couple of minutes.
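Disabling it in the compose/stack file looks something like this:

services:
  swarmtest:
    image: swarmtest
    healthcheck:
      disable: true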

Finally, if you have a service that takes a long time to start up (perhaps because it’s doing a data migration), you may want to configure the ‘start period’ of the healthcheck, so that it stays in ‘starting’ mode for longer and doesn’t drop into ‘unhealthy’, where it might be killed off before finishing.
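On the HEALTHCHECK line that’s the --start-period option (the compose-file equivalent is start_period); for example, to give a slow-starting service ten minutes of grace before failed checks start counting:

HEALTHCHECK --start-period=10m CMD /usr/bin/curl --fail http://localhost/ || exit 1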

Optimising the size of Docker containers

Or ‘Optimizing the size of Docker containers’, in case people from America or from Oxford are Googling for it…

For Docker users, here are a couple of tricks when writing Dockerfiles which can help keep the resulting container images to a more manageable size.
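As a written taster of the kind of thing involved (the video has its own examples, which may differ), one classic trick is to chain related commands into a single RUN instruction and clean up the package cache in the same step, so the temporary files never get baked into an image layer. Something like this, where some-package is just a placeholder:

RUN apt-get update && \
    apt-get install -y --no-install-recommends some-package && \
    rm -rf /var/lib/apt/lists/*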

Also available on Vimeo here.

Docker

A geeky post. You have been warned.

I wanted to make a brief reference to my favourite new technology: Docker. It’s brief because this is far too big a topic to cover in a single post, but if you’re involved in any kind of Linux development activity, then trust me, you want to know about this.

What is Docker?

Docker is a new and friendly packaging of a collection of existing technologies in the Linux kernel. As a crude first approximation, a Docker ‘container’ is like a very lightweight virtual machine. Something between virtualenv and VirtualBox. Or, as somebody very aptly put it, “chroot on steroids”. It makes use of LXC (Linux Containers), cgroups, kernel namespaces and AUFS to give you most of the benefit of running several separate machines, but they are all in fact using the same kernel, and some of the operating system facilities, of the host. The separation is good enough that you can, for example, run a basic Ubuntu 12.04 environment, an Ubuntu 14.04 one, and a SUSE one, all on a CentOS server.

“Fine”, you may well say, “but I can do all this with VirtualBox, or VMWare, or Xen – why would I need Docker?”

Well, the difference is that Docker containers typically start up in milliseconds instead of seconds, and more importantly, they are so lightweight that you can run hundreds of them on a typical machine, where using full VMs you would probably grind to a halt after about half a dozen. (This is mostly because the separate worlds are, in fact, groups of processes within the same kernel: you don’t need to set aside a gigabyte of memory for each container, for example.)
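If you want a quick taste and you have Docker installed, something like this drops you into a shell in a throwaway Ubuntu container in a second or two (exit the shell and the container stops):

docker run -it ubuntu:14.04 /bin/bash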

Docker has been around for about a year and a half, but it’s getting a lot of attention at present partly because it has just hit version 1.0 and been declared ready for production use, and partly because, at the first DockerCon conference, held just a couple of weeks ago, several large players like Rackspace and Spotify revealed how they were already using and supporting the technology.

Add to this the boot2docker project which makes this all rather nice and efficient to use even if you’re on a Mac or Windows, and you can see why I’m taking time out of my Sunday afternoon to tell you about it!

I’ll write more about this in future, but I think this is going to be rather big. For more information, look at the Docker site, search YouTube for Docker-related videos, and take a look at the Docker blog.

GRUB, Ubuntu, and failed boots

Another geeky technical post. Ignore if it’s not your thing….

I wrote recently about the GRUB bootloader and how it can sometimes cause a remote or headless server not to come back online after, e.g. a kernel update.

This can happen in other situations too. The configuration used on recent versions of Ubuntu is such that, if the system thinks the last boot attempt failed, it stops the machine from automatically booting into the same configuration again: it cancels the countdown timer by setting it to -1, so the machine waits indefinitely at the boot menu until you decide on the best action to take. This is a sensible default, because a machine that goes into an infinite loop of reboots is doing nobody any good and puts a fair bit of strain on its own hardware.

Unfortunately, though, other things can trigger this behaviour. If you have a power fluctuation, for example, such that the machine restarts, gets part-way and then power-cycles again, you may find yourself with a machine that doesn’t come back online of its own accord.

On the most recent Ubuntu versions (12.04 with updates, and later) you can add a setting to /etc/default/grub:

GRUB_RECORDFAIL_TIMEOUT=30

for a 30 second timeout after it has recorded a failure condition. You can use 0 if you don’t want it to pause and show the boot menu at all, but remember that it could then go into fairly fast repeated reboots if something really does go wrong.

On earlier versions, you’ll need to edit /etc/grub.d/00_header and find the line near the bottom, in the make_timeout() function, which sets

set timeout=-1

and change that to your preferred value.
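For example, to wait ten seconds at the menu after a recorded failure:

set timeout=10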

In either case you’ll then need to run:

update-grub

to make your changes take effect.

Configuring GRUB to boot the right kernel after an upgrade

This is based on a very useful post by a chap named Arie. The credit for much of the following goes to him: I’m just reposting a summary to make it more widely discoverable.

If you upgrade your Linux installation, which on Ubuntu/Debian you might do with something like this:

sudo aptitude update
sudo aptitude safe-upgrade

then you may find that your kernel (linux-image package) and GRUB bootloader configuration have been updated. In particular, it may not now, by default, automatically select the right boot menu option when you next reboot… which you’re probably just about to do because you’ve just upgraded everything. This makes some sense because, if the upgrade failed for some reason, you may want to select a different option from the boot menu and go back to the previous kernel.

Well, it makes sense, unless, of course, you aren’t sitting in front of the machine with a screen and keyboard. If you’re upgrading a remote server, you may find that it doesn’t come back up after your reboot. In my case, the server was very remote, and I was lucky to have someone on site to press the right keys.

So, before you reboot after a new kernel installation, you may want to configure GRUB to try the new kernel automatically, and if it fails, go back to the old one.

GRUB is configured by the file /boot/grub/grub.cfg, but on a modern Ubuntu, rather than editing this big and complex file directly, you edit a few settings in /etc/default/grub, and then run update-grub to rebuild it out of lots of separate bits.

So, look at the variables in /etc/default/grub and make sure the following are set:

GRUB_DEFAULT=saved
GRUB_TIMEOUT=2
GRUB_CMDLINE_LINUX_DEFAULT="panic=5"

This tells GRUB to use the last-saved selection, to boot it automatically after 2 seconds, and to pass panic=5 to every kernel so that the machine reboots itself five seconds after a kernel panic. Then you need to configure which kernel is initially the one that’s ‘saved’. Set it to be the one you know works, e.g.:

sudo grub-set-default "Ubuntu, with Linux 3.2.0-29-generic"

(You can look at the kernel you’re currently running with uname -a and find the label used to select it in /boot/grub/grub.cfg. Try grep menuentry /boot/grub/grub.cfg, for example.)

Then tell GRUB to try the new kernel on the next reboot, e.g.:

sudo grub-reboot "Ubuntu, with Linux 3.2.0-32-new"

(This doesn’t actually do the reboot.)

Then save all of these things using:

sudo update-grub

And try rebooting.

If it works OK and you come up in the new kernel, set that to be the saved default for the future:

sudo grub-set-default "Ubuntu, with Linux 3.2.0-32-new"

Using Little Computers to control Big Computers

Here’s my latest Raspberry Pi-based experiment: the CloudSwitch.

I don’t discuss the software in the video, but the fun thing is that the Pi isn’t dependent on some intermediate server – it’s using the boto module for Python to manage the AWS resources directly.

I decided to build the app slightly differently from the way I would normally approach a little project like this. I knew that, even for this very simple system, I would have several inputs and outputs of various kinds, some of them with big delays, and I wanted to make sure that timing hiccups or race conditions didn’t ever leave the lights displaying something that didn’t represent reality.

So this is only a single Python file, but it runs several threads – one that looks for button presses, one that monitors and controls the Amazon server, and one that handles the lights – including flashing them in various patterns. They interact with the main thread using ZeroMQ messages, which is a lovely way to do inter-thread communications without all that nasty messing about with semaphores and mutexes.
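This isn’t the CloudSwitch code itself, but as a minimal sketch of the pattern (assuming the pyzmq package; all the names here are invented for illustration), a worker thread can push messages to the main thread over ZeroMQ’s in-process transport like this:

import threading
import time
import zmq

ctx = zmq.Context.instance()

def button_watcher():
    # stand-in for the real button thread: wait a moment, then report a press
    sock = ctx.socket(zmq.PUSH)
    sock.connect("inproc://events")
    time.sleep(1)
    sock.send_string("button pressed")
    sock.close()

# the main thread binds first (inproc requires bind before connect), then just
# reacts to whatever messages the worker threads send
sink = ctx.socket(zmq.PULL)
sink.bind("inproc://events")
threading.Thread(target=button_watcher).start()
print(sink.recv_string())
sink.close()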

Update: Here’s the very simple circuit diagram. The illuminated buttons I used have LEDs which take a little more power than the Raspberry Pi can really drive, so I put a couple of NPN transistors in there. It really doesn’t matter too much what they are – I used the 2N3904.

Raspberry Pi Webcam Viewer

I finally got a chance to play with my RaspberryPi, so I threw together a quick experiment.


Update: A few people have asked me for a little more information. I’m happy to make the source code available, but it’s not very tidy and a bit specific to my situation… however, to answer some of the questions:

The enclosure for the Raspberry Pi comes from SK Pang Electronics, and it’s perfect for my needs. You can buy just the perspex cover, but they do a nice Starter Kit which includes the breadboard, some LEDs, resistors and the pushswitch. Definitely recommended.

For the graphics, I used the PyGame library, which has the advantage of being cross-platform: you can use it with a variety of different graphics systems on a variety of different devices. On most Linux boxes, you’d normally run it under X Windows, but I discovered that it has various drivers that can use the console framebuffer device directly. This makes for a quicker startup and lighter-weight system, though I imagine it probably has less access to hardware acceleration, so it’s probably not the way to go if your graphics need high performance. You can read about how to get a PyGame display ‘surface’ (something you can draw on) from the framebuffer, in a handy post here.
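As a rough sketch of that approach (assuming the framebuffer device is /dev/fb0 and the ‘fbcon’ SDL driver is available – the linked post covers the variations):

import os
import pygame

os.environ["SDL_VIDEODRIVER"] = "fbcon"   # draw straight to the console framebuffer
os.environ["SDL_FBDEV"] = "/dev/fb0"
pygame.display.init()
info = pygame.display.Info()
screen = pygame.display.set_mode((info.current_w, info.current_h))  # full-screen surface to draw on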

To load an image from a file in PyGame is easy: you do something like this:

            im_surf = pygame.image.load(f, "cam.jpg")

where ‘f’ is an open file, and the ‘cam.jpg’ is just an invented filename to give the library a hint about the type of file it’s loading.

Now, with a webcam, we need to get the image from a URL, not from a file. It’s easy to read the contents of a URL in Python. You just need something like:

            import urllib
            img = urllib.urlopen(img_url).read()

but that will give you the bytes of the image as a string. If we want to convert it into a PyGame surface, we need to make it look more like a file. Fortunately, Python has a module called StringIO which does just that: allows you to treat strings as if they were files. So to load a JPEG from img_url and turn it into a PyGame surface which you can blit onto the screen, you can do something like:

          f = StringIO.StringIO(urllib.urlopen(img_url).read())
          im_surf = pygame.image.load(f, "cam.jpg")

I’ll leave the remaining bits as an exercise for the reader!

If you like this, you might also like my CloudSwitch

Remote keyboard and mouse for your RaspberryPi

Warning – technical post ahead…

My shiny new RaspberryPi came through the door this week, and the first thing I noticed was that it was the first computer I’d ever received where the postman didn’t need to ring the bell to deliver it.

The next thing I discovered, because it came sooner than expected, was that I was missing some of the key bits needed to play with it. A power supply was no problem – I have a selection of those from old phones and things, and I found a USB-to-micro-USB cable. Nor was a network connection tricky – I have about as many ethernet switches as I have rooms in the house.

A monitor was a bit more challenging – I have lots of them about, but not with HDMI inputs, and I didn’t have an HDMI-DVI adaptor. But then I remembered that I do have an HDMI monitor – a little 7″ Lilliput – which has proved to be handy on all sorts of occasions.

The next hiccup was an SD card. You need a 2GB or larger one on which to load an image of the standard Debian operating system. I had one of those, or at least I thought I did. But it turned out that my no-name generic 2GB card was in fact something like 1.999GB, so the image didn’t quite fit. And a truncated filesystem is not the best place to start. But once replaced with a Kingston one – thanks, Richard – all was well.

Now, what about a keyboard and mouse?

Well, a keyboard wasn’t a problem, but it didn’t seem to like my mouse. Actually, I think the combined power consumption probably just exceeded the capabilities of my old Blackberry power supply, which only delivers 0.5A.

If you’re just doing text-console stuff, then this isn’t an issue, because it’s easy to log into it over the network from another machine. It prints its IP address just above the login prompt, so you can connect to it using SSH, if you’re on a real computer, or using PuTTY if you’re on Windows.
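For example, from another machine on the same network (the address here is just a placeholder; ‘pi’ is the default user on the standard image):

ssh pi@192.168.1.25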

But suppose you’d like to play with the graphical interface, even though you don’t have a spare keyboard and mouse handy?

Well, fortunately, the Pi uses X Windows, and one of the cunning things about X is that it’s a networked display system, so you can run programs on one machine and display them on another. You can also send keyboard and mouse events from one machine to another, so you can get your desktop machine to send mouse movements and key presses from there. On another Linux box, you can run x2x. On a Mac, there’s Digital Flapjack’s osx2x (If that link is dead, see the note at the end of the post).

These both have the effect of allowing you to move your mouse pointer off the side of the screen and onto your RaspberryPi. If you have a Windows machine, I don’t think there’s a direct equivalent. (Anyone?) So you may need to set up something like Synergy, which should also work fine, but is a different procedure from that listed below. The following requires you to make some changes to the configurations on your RaspberryPi, but not to install any new software on it.

Now, obviously, allowing other machines to interfere with your display over the network is something you normally don’t want to happen! So most machines running X have various permission controls in place, and the RaspberryPi is no exception. I’m assuming here that you’re on a network behind a firewall/router and you can be a bit more relaxed about this for the purposes of experimentation, so we’re going to turn most of them off.

Running startx

Firstly, when you log in to the Pi, you’re normally at the command prompt, and you fire up the graphical environment by typing startx. Only a user logged in at the console is allowed to do that. If you’d like to be able to start it up when you’re logged in through an ssh connection, you need to edit the file /etc/X11/Xwrapper.config, e.g. using:

sudo nano /etc/X11/Xwrapper.config

and change the line that says:

allowed_users=console

to say:

allowed_users=anybody

Then you can type startx when you’re logged in remotely. Or startx &, if you’d like it to run in the background and give you your console back.

Allowing network connections

Secondly, the Pi isn’t normally listening for X events coming over the network, just from the local machine. You can fix that by editing /etc/X11/xinit/xserverrc and finding the line that says:

exec /usr/bin/X -nolisten tcp "$@"

Take out the ‘-nolisten tcp’ bit, so it says

exec /usr/bin/X "$@" 

Now it’s a networked display system.

Choosing who’s allowed to connect

There are various complicated, sophisticated and secure ways of enabling only very specific users or machines to connect. If you want to use those, then you need to go away and read about xauth.

I’m going to assume that on your home network you’re happy for anyone who can contact your Pi to send it stuff, so we’ll use the simplest case where we allow everything. You need to run the ‘xhost +’ command, and you need to do it from within the X environment once it has started up.

The easiest way to do this is to create a script which is run as part of the X startup process: create a new file called, say:

/etc/X11/Xsession.d/80allow_all_connections

It only needs to contain one line:

/usr/bin/xhost +

Now, when the graphical environment starts up, it will allow X connections across the network from any machine behind your firewall. This lets other computers send keyboard and mouse events, but also do much more. The clock in the photo above, for example, is displayed on my Pi, but actually running on my Mac… however, that’s a different story.

For now, you need to configure x2x, osx2x, or whatever is sending the events to send them to ip_address:0, where ip_address is the address of your Pi. I’m using osx2x, so I create a connection to the Pi’s address and position it to the west of my screen.

Once that’s done, I can just move the mouse off the west (left-hand) side of my screen and it starts moving the pointer on the RaspberryPi. Keyboard events follow the pointer. (I had to click the mouse once initially to get the connection to wake up.)
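(If you’re using x2x from a Linux desktop instead, the rough equivalent is a single command – something like this, with your Pi’s actual address in place of the placeholder:)

x2x -west -to 192.168.1.20:0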

Very handy, and it saves on desk space, too!

Update: Michael’s no longer actively maintaining binaries for osx2x, and the older ones you find around may not work on recent versions of OS X. So I’ve compiled a binary for Lion(10.7) and later, which you can find in a ZIP file here. Michael’s source code is on Github, and my fork, from which this is built, is here.

After the $100 laptop, the $50 desktop?

It’s almost exactly 10 years since we started the Ndiyo project, with the aim of providing computing access to people for something “closer to the cost of a VGA lead than the cost of a computer”. Ndiyo has now formally closed, but it led to many other activities, including the founding of DisplayLink, and successful past projects in collaboration with the GSM Association, No-PC, and others.

Today, there’s some more good news.

Towards the end of Ndiyo’s life we started to experiment with a model we called ‘Hubster’ – the name coming from using a USB hub as the core of a thin-client terminal, something made possible once DisplayLink’s evolution of the Ndiyo technology allowed monitors, as well as keyboards and mice, to be connected over USB. The idea is to share the power, cost, and the carbon footprint of a PC between two or more users at once, simply by plugging in enough USB peripherals to give the extra users access to it. This is important for everybody, but especially for the poorer parts of the world where the cost of owning one PC per person is prohibitive.

Well, over the last few years, Bernie Thompson at Plugable.com in Seattle has been beavering quietly away to make this more of a reality, by providing DisplayLink-based terminals at reasonable prices and by maintaining the Open Source software to drive them. There are two bits of good news coming from his recent efforts:

  • First, he’s been working with RedHat to get support for USB terminals built in to the standard distribution of Fedora 17. We’re getting very close now to the ideal situation where you can turn any Linux box into a multi-user box simply by plugging in enough components for a new user, and a new login prompt will appear when you do so. Imagine, say, a disaster-relief information centre where one person with a laptop connected to a satellite link can easily provide access for half a dozen members of the team just by plugging them in. And it’s all Open Source. Zero extra software cost per user.
  • Secondly, in an attempt to get the cost of these terminals down and support ongoing development of the Open Source components, he’s launched a Kickstarter project called ‘The $50 Computer’.

Some may, quite sensibly, ask how this compares to the RaspberryPi, which is, after all, even cheaper, and is a standalone machine. Well, this one comes with a box!

No, seriously, they are both excellent projects – I have a RaspberryPi on order, too – but they fill different roles. RaspberryPi, in the early years at least, will be about teaching people the basics of how computing works. Bernie and the Plugable team are creating a system where providing fully-featured applications to multiple users at very low cost is something that a non-technical user can do simply by plugging in USB devices.

In some situations, perhaps in an internet cafe, you may just want to give extra users access to a Chrome browser. But with this system you also have the option of providing them with OpenOffice, with Scrivener, with Blender, with Corel Aftershot Pro, with Sublime Text, with Skype, with… well you get the idea! And, of course, with access to however many giga- or terabytes of storage you care to put in the PC.

I think the Kickstarter plan is a great one – I wish it had been around when we started Ndiyo.

Please support it if you can… Even if you don’t need more affordable computing devices, for your kids, your school, your internet cafe, your office, there are millions who do. Billions, in fact.

Every $10 we can shave off the cost of access to IT makes it accessible to many thousands of new people globally… and now you have a chance to help.

Speed up slow console in VirtualBox

This is one of those geeky posts that is here for the benefit of (a) my failing memory – so I can search for it later – and for (b) anyone who happens to Google for the right keywords.

At Camvine, most of our development is done on Linux, but many of us have Macs as our preferred desktop machines. So we regularly install Ubuntu Linux Server as a virtual machine using VirtualBox. We’re not interested in a graphical interface to the VM – the console is fine. Unfortunately, the default video driver that Ubuntu uses when running under VirtualBox is painfully slow, to the extent that we usually minimize the window and SSH into the machine.

Fortunately, the guys on this thread found a solution – thanks everyone!

On the VM, edit /etc/modprobe.d/blacklist-framebuffer.conf and add a line:

blacklist vga16fb

Reboot the VM and speedy scrolling should be yours again.

(Of course, you may still want to ssh in to get more convenient cut & paste, etc., but it’s nice to have a speedy console anyway, and you may not always have network access to the VM.)

Flushing your DNS cache

OK – a really geeky little tutorial, this one. If you’ve never felt the urge to flush your DNS cache, then don’t worry, that’s quite normal, many people live long and happy lives without ever doing so, and you should feel free to ignore this post and go about your other business.

A little bit of background…

DNS lookups, as many of my readers will be aware, are cached. The whole DNS system would crumble and fall if, whenever your PC needed to look up statusq.org, say, it had to go back to the domain’s name server to discover that the name corresponded to the IP address 74.55.156.82. It would need to do this before it could even start to get anything from the server, so every connection would also be painfully slow. To avoid this, the information, once you’ve asked for it the first time, is probably cached by your browser, and your machine, and, if you’re at work, your company’s DNS server, and their ISP’s DNS server… and it’s only if none of those know the answer that it will go back to the statusq.org domain’s official name server – GoDaddy, in this case – to find out what’s what.

Of course, all machines need to do that from time to time, anyway, because the information may change and their copy may be out of date. Each entry in the DNS system therefore can be given a TTL – a ‘Time To Live’ – which is guidance on how frequently the cached information should be flushed away and re-fetched from the source.
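You can see the TTL that a particular server or cache is currently handing out for a record using dig; the number in the second column of the ANSWER SECTION is the TTL in seconds, and it counts down if you’re querying a cache:

dig statusq.org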

On GoDaddy, this defaults to one hour – really rather a short period, and since they’re the largest DNS registrar, this probably causes a lot of unnecessary traffic on the net as a whole. If you’re confident that your server is going to stay on the same IP address for a good long time, you should set your TTLs to something more substantial – perhaps a day, or even a week. This will help to distribute the load on the network, reduce the likelihood of failed connections, and, on average, speed up interactions with your server. The reason people don’t regularly set their TTL to something long is that, when you do need to change the mapping, perhaps because your server has died and you’ve had to move to a new machine, the old values may hang around in everybody’s caches for quite a while, and that can be a nuisance.

It’s useful to think about this when making DNS changes, because you, at least, will want to check fairly swiftly that the new values work OK. There’s nothing worse than making a typo in the IP address of an entry with a long TTL, and having all of your customers going to somebody else’s site for a week.

So, if you know you’re going to be making changes in the near future, shorten the TTL on your entries a few days in advance. Machines picking up the old value will then know to treat it as more temporary. You can lengthen the TTLs again once you know everything is happy.

Secondly, just before you make the change, try to avoid using the old address, for example by pointing your browser at it. This goes for new domains, too – the domain provider will probably set the DNS entry to point at some temporary page initially – and if you try out your shiny new domain name immediately, you’ll then have to wait a couple of hours before you can access your real server that way. Make the DNS change immediately, before your machine has ever looked it up and so put it into its cache and any intervening ones.

Finally, once you’ve made a change, you may be able to encourage your machines to use the new value more quickly by flushing their local caches. This won’t help so much if they are retrieving it via an ISP’s caching proxy, for example, but it’s worth a try.

Here’s how you can use the command line to flush the cache on a few different platforms. Please feel free to add any others in the comments:

On recent versions of Mac OS X:

sudo dscacheutil -flushcache

On older versions of OS X:

sudo lookupd -flushcache

On Windows:

ipconfig /flushdns

On Linux, if your machine is running the nscd daemon:

sudo /etc/rc.d/init.d/nscd restart

If you’re actually running a DNS server, for example for your organisation’s local network:

On Linux running bind9:

rndc flush

On Linux running bind8:

ndc flush

On Ubuntu/Debian running named:

/etc/init.d/named restart

© Copyright Quentin Stafford-Fraser