Looking backwards at the future

Searching recently for emails from one of my academic colleagues, I came across one or two that appeared to have the address written backwards. He works in the Computer Lab at Cambridge, and the email was from user@uk.ac.cam.cl. What was going on?

Well, the simple answer was that my mail archives stretch back quite a long way. I have emails I received from my friend Peter just last week, but I also have some from him that arrived in the early 1990s. That was just about the time that the UK’s academic networks were switching from the Name Registration Scheme (NRS) they had used up to that point over to the Domain Name System (DNS), which was becoming the standard in other parts of the world. NRS addresses started at the more general and worked down to the more specific. Hence uk.ac.cam.cl.
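
To make the ordering difference concrete, here is a minimal Python sketch that flips the dot-separated labels of an address from general-to-specific order into the specific-to-general order DNS uses. The function name nrs_to_dns is made up for this illustration, and the sketch only reverses labels: it makes no attempt to detect which order an address is already in.

```python
def nrs_to_dns(address: str) -> str:
    """Reverse the dot-separated domain labels of an address.

    Illustrative only: this is plain label reversal, not a full
    account of how addresses were translated in practice.
    """
    # Split off the user part, if any; a bare domain has no "@".
    user, sep, domain = address.rpartition("@")
    labels = domain.split(".")
    return f"{user}{sep}{'.'.join(reversed(labels))}"


print(nrs_to_dns("user@uk.ac.cam.cl"))  # user@cl.cam.ac.uk
```

Running the same function on its own output gives back the original address, since reversing the labels twice is a no-op.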

Actually, email addresses in general tended to look like USER@UK.AC.CAM.CL because on mainframes EVERYTHING TENDED TO BE IN CAPITALS. But Peter was fortunate enough to be an early user of Xerox and Unix-based systems, which were more lower-casey; more cuddly California, less corporate IBM. By the start of the 90s, I too had an email address that looked like quentin.stafford-fraser@uk.ac.cam.cl.

Anyway, the fact that I still have emails from 30 years ago made me reflect, once again, on how extraordinarily successful email has been, not just as a communication medium, but as a storage format.

When I think back on other electronic documents of the time, few, if any, could be read now. The companies behind my early ‘desktop publishing’ programs are no longer in existence. Microsoft Word long ago lost the ability to open documents it had created in the past. And I imagine my documents from WordStar, WordPerfect, Microsoft Works and others would be just as challenging, if I could even find them.

But my email messages I can find. And I can read them. This is despite the fact that they have been through dozens of different email systems, created by a wide range of apps on multiple operating systems, stored on servers around the world and hard disks in my various homes and offices, and accessed through a range of different protocols (IMAP, for most of that period). Not only is my email readable, but it’s easily searchable from multiple locations using a choice of apps on any of my devices. It’s tagged with helpful metadata about authorship, time of creation and receipt, etc. I can choose to store it myself or pay others to do so. And so on. Almost no other digital storage system has proved as powerful and flexible as IMAP-accessed email.

Much of this comes, of course, from the fact that email is governed by open standards, accessed through open protocols, and often stored in non-proprietary formats. Because email is fundamentally about inter-operation, providers have had no choice but to support them. It bugs me that I don’t have my pre-1991 emails, but that was probably because of an inadvertent slip on my part, or a hard disk crash, or something, rather than because of a fundamental limitation of the technology. If I do ever find them on some backup, I’m confident I’ll be able to include them in my archive.
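
As one illustration of what a non-proprietary format buys you, here is a small sketch that searches an old archive using nothing but Python's standard library. It assumes the archive happens to be a single file in mbox format, which is one common plain-text layout; that is an assumption made for the example, not a description of how the archive discussed here is actually stored. The function name find_messages and the file path in the usage note are likewise hypothetical.

```python
import mailbox


def find_messages(mbox_path, sender_fragment):
    """Return (date, subject) pairs for messages whose From header
    contains sender_fragment, ignoring case.

    Assumes mbox_path points at an existing mbox-format file.
    """
    # create=False: fail rather than silently create an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    matches = []
    try:
        for msg in box:
            sender = str(msg.get("From", ""))
            if sender_fragment.lower() in sender.lower():
                matches.append((msg.get("Date"), msg.get("Subject")))
    finally:
        box.close()
    return matches
```

Calling something like find_messages("old-mail.mbox", "uk.ac.cam") would then list the date and subject of every message from an address containing that fragment, whichever order its labels are written in.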

This explains why, like some of my colleagues, I’ve resisted my University’s recent attempts to migrate our email accounts from our existing Open-Source-based system to Microsoft Exchange Online. It’s not because I dislike Exchange per se; after a rocky first decade or two it seems to be settling down quite nicely. But I don’t want to use a Microsoft email reader on all my devices — my own are much better, thank you — and Exchange has repeatedly shown an inability to support IMAP reliably. The messages are also not stored anywhere on a server where I could extract them by any other means in a standard format when I want to move them elsewhere. And I will want to move them elsewhere at some point; history shows me that. Fortunately, I have that power. If my email shows any danger of being locked into proprietary formats, I can simply arrange for it to be forwarded to my own servers and handle it however I like there; that’s what I’ll do if the University turns off the old system completely. And since almost everything does support IMAP, I can move emails around the world to my preferred location with a simple drag and drop.

One of my colleagues said in a recent meeting that his children don’t know what the fuss is about. Email is just something they glance at once a week to see if they’ve had any. As long as it works, they don’t mind where it comes from. Well, they may be right; perhaps it will be less important in future. But this may also be a natural tendency of the young just to focus on the immediate here and now, and the immediate future.

To me, and occasionally to other people, my email archive has turned out to be important. Something I wrote 20 years ago becomes relevant to a patent case now and earns me money because I can look back at the records. Interviewers ask me about the technologies used in a particular project and I can search back to find the answers. I forget the name of a good B&B or hotel in a particular city; email allows me to find it again. I generally had no idea, at the time, that these communications might prove to be important. But they’re a key part of the history of my life.

So here’s my question: If the things you’re doing today turn out to be important a few decades from now, what sort of digital archive would they need to be in for you to find and make use of them then? Best to start using that today, before it’s too late.

What did you do to keep warm between Thanksgiving and Christmas in the old days, daddy?

An interesting bit of data visualisation by Andy Kriebel gives some ideas.

I’d love to see how this varies for different countries/climates…

Personal Analytics

I wrote a few months back about how I was using a GPS logger to keep a record of my movements. Some people think I’m a little eccentric – I think that’s the word – for doing so.

But my data-gathering is nothing compared to Stephen Wolfram’s. In a splendid Wired article called The Personal Analytics of My Life, he discusses some of the insights he’s been able to glean from his own historical records. One inspired idea, which I confess had never occurred to me, is to run a keystroke logger; he’s captured everything he’s typed for many years. (Now, that’s data you wouldn’t want to fall into the wrong hands!)

I once thought seriously about capturing, say, once or twice a minute, the image of my screen, which I could then later OCR, search, use to recreate lost documents, etc. But other than helping Sheng Feng Li with a system that did some of this for VNC, I never took it any further. Worth reconsidering, perhaps…

Anyway, many thanks to Richard for pointing me at the Wolfram article, which is worth a read.

I suppose that another way to analyse data about your life is to do the analysis on the fly and record the results there and then. That’s called a blog.

© Copyright Quentin Stafford-Fraser