Disk Risk

Mmm. I seem to have had a lot of hard drive failures recently – Seagate drives, mostly, though, to be fair, the majority of my drives are Seagate just because my favourite supplier happens to like them, so I would expect to see more failures there. The last one, though, is just 18 months old and has started making ominous clicking noises. They don’t make ’em like they used to. Stuff I’ve read online tends to suggest that it’s hard to assign blame to particular drive manufacturers, but particular models do tend to have rather different failure rates.

I do, I realise, have rather a lot of hard disks. I have three 4-bay Drobo enclosures, for a start, so that’s 12 drives even before I start adding on the miscellaneous backup disks, TV-recording disks, etc. Not to mention the internal ones in all our various machines. There must be 20-25 hard disks around here, and even though manufacturers’ specs talk about a <1% annual failure rate, studies tend to suggest that real-world figures are rather higher. One of the biggest studies, done by Google a few years ago, showed failure rates of 1.7% in the first year, rising to over 8% in the third year.

Yes, many of my drives are about that age, so if I really have 25 of them, then at roughly 8% a year that works out at about two failures annually. I guess I should expect one to die every six months or so. Bother.
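For anyone who wants to plug in their own numbers, here is a rough sketch of that back-of-envelope sum in Python. It assumes every drive fails independently and at the same annual rate (the 8% third-year figure quoted above), which is a simplification of mine rather than anything the Google study claims. The function names are just made up for illustration.

```python
def expected_failures_per_year(drives: int, annual_failure_rate: float) -> float:
    """Average number of drive failures per year across the whole collection."""
    return drives * annual_failure_rate


def months_between_failures(drives: int, annual_failure_rate: float) -> float:
    """Average gap, in months, between one failure and the next."""
    return 12 / expected_failures_per_year(drives, annual_failure_rate)


def prob_at_least_one_failure(drives: int, annual_failure_rate: float) -> float:
    """Chance that at least one drive dies within a year (independence assumed)."""
    return 1 - (1 - annual_failure_rate) ** drives


if __name__ == "__main__":
    drives, rate = 25, 0.08  # my rough drive count and the third-year rate above
    print(f"{expected_failures_per_year(drives, rate):.1f} failures per year")
    print(f"one failure every {months_between_failures(drives, rate):.1f} months")
    print(f"{prob_at_least_one_failure(drives, rate):.0%} chance of at least one failure this year")
```

With 25 drives at 8% this comes out at two failures a year, or one every six months, and the chance of getting through a whole year with no failures at all is only around one in eight.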

This suggests to me that money spent on things like my Drobo enclosures is worthwhile, because, though they are pricey, especially once you’ve filled them up with drives, any single drive failure is unlikely to be catastrophic – as disks die, you just replace them with whatever size is currently in vogue. My main Drobo currently has two 2TB drives, one 1.5TB, and a 1TB. There are those, I know, who have had less positive experiences with some Drobo kit – I found a DroboShare networking add-on to be decidedly wobbly at a past company – but in the simple use case of a Drobo plugged into a computer, I’ve been very happy and have replaced several drives without ever losing data.

The other thing that the Google study found was a strong correlation between when disks start reporting errors (which they can do using the S.M.A.R.T technology built into modern drives) and a failure soon afterwards. It’s worth, therefore, having something that checks the S.M.A.R.T status and lets you know about issues as soon as they are reported, even if the drive is still apparently working OK. On the Mac, Disk Utility can tell you about issues, but only when you go and look, so I use SMARTreporter to give more regular checks.
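If you would rather roll your own check than install a utility, something along these lines might do as a starting point. It is only a sketch: it assumes the macOS `diskutil info` command prints a line beginning `SMART Status:` with values such as `Verified`, `Failing` or `Not Supported`, which is what I believe it does, but do check the output on your own machine. The function names and the default `disk0` identifier are just for illustration.

```python
import subprocess
from typing import Optional


def parse_smart_status(diskutil_output: str) -> Optional[str]:
    """Pull the value of the 'SMART Status' line out of `diskutil info` output.

    Returns None if no such line is present.
    """
    for line in diskutil_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SMART Status":
            return value.strip()
    return None


def smart_status(disk: str = "disk0") -> Optional[str]:
    """Ask diskutil about one disk (macOS only) and return its S.M.A.R.T status."""
    result = subprocess.run(
        ["diskutil", "info", disk],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_smart_status(result.stdout)


def needs_attention(status: Optional[str]) -> bool:
    """True when the drive reports anything other than a clean bill of health.

    A missing or 'Not Supported' status is treated as unknown, not as a warning.
    """
    return status is not None and status not in ("Verified", "Not Supported")


if __name__ == "__main__":
    status = smart_status("disk0")
    print(f"disk0: {status}")
    if needs_attention(status):
        print("Time to check the backups and order a replacement.")
```

Run from cron or launchd every so often, this would give you the regular nagging that Disk Utility doesn't. Note that a status it can't read is reported as unknown rather than healthy, so it won't give false reassurance.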

OK, things are getting better. There is another issue, though.

On the Mac, at least, most external drives are connected by USB or FireWire, and in general S.M.A.R.T information is not read through those interfaces – if you look in Disk Utility, you’ll see it’s ‘Unavailable’. More sophisticated enclosures like the Drobo will check the S.M.A.R.T status themselves and warn you when things look dubious, but your average USB-connected backup drive may give you no such warnings.

So I was interested to discover this kernel driver project which enhances the standard OS X USB and FireWire drivers to make S.M.A.R.T available for a lot more interfaces. (Download v0.5 here.) I’ll try it on my Media Mac Mini, which has three external drives, and see how it goes…

© Copyright Quentin Stafford-Fraser