Category Archives: Internet

Posting for Posterity

I’ve written before (e.g. in May) about the importance of the Internet Archive, which I was fortunate enough to visit in its early days. It’s a hugely valuable resource for many reasons, not least in giving some protection against link rot through its ‘Wayback Machine’.

What I’m embarrassed to say I didn’t know until recently, or had forgotten, is that there is also a UK Web Archive at webarchive.org.uk. It’s a very nicely-done collaborative project of the UK Legal Deposit Libraries, and performs a similar task for UK-based websites.

It’s been going for 10 years now, which is a good span but not nearly as long as the Internet Archive. So if, say, you were feeling gloomy about the situation in the UK and needed to be cheered up, you could go and look at the old News of the World site and be grateful that it ceased to exist 12 years ago. For that, though, you would need to go to the Internet Archive.

The UKWA is a great initiative, and worth supporting. If you have a UK-based site which isn’t already indexed, let them know. It’s another good way to try and ensure it outlives you, and they try to update their copy at least annually.

And if you want to know more about the UK’s Legal Deposit Libraries which are behind the project, Tom Scott (of course) has a nice new video.  

 

The day the internet died

optical fibre cut by hedge trimmer

Oops. At the start of the holiday weekend, I managed to cut the optical fibre providing our internet connection. I realise that it’s one of our most important cables, one of the thinnest and most vulnerable, and pretty much the only one we have that I’m incapable of repairing myself!

In case you’re wondering, the hedge wasn’t there when the fibre was installed, and had since grown up to cover it. I would have been alright if it weren’t for the fact that optical fibres can’t be bent around tight corners, and so had to bulge away from the wall before going through it…

A day that shall live on in infamy. Though not as much infamy as it might have had in the absence of phone-based backup connections.

Let’s ask Quentin (RIP)!

I’ve been contemplating how to achieve immortality.  I’m sure you often do the same thing over breakfast on a sunny morning.

It has often occurred to me that, since so much of my output is in digital form, it may vanish without a trace once I’m gone, since nobody will be paying the hosting fees. Had I written more things that had made it into print, they might at least have lingered in the dark recesses of a library somewhere for a rather longer period. Perhaps even gathered dust on one or two people’s bookshelves. Probably nobody would ever read them, but it would be comforting to know that they were there!

In reality, of course, digital data should last a lot longer, as long as it’s maintained.  If I were really wealthy and cared enough about this vanity project, I would leave behind an invested sum big enough to pay for web hosting in perpetuity plus one day per year of an IT consultant’s time to update the formats, check the backups, etc.

Fortunately, though, I have some hope that my 20+ years of blog posts won’t just vanish into the ether when Rose forgets to pay the web hosting bill after I’m gone, partly because there are periodic snapshots on the wonderful Internet Archive. (Here’s what Status-Q looked like in early 2001.)  

Brewster Kahle, the man behind the Archive, was good enough back in 2005 to give me a tour of their headquarters, which was then located in the Presidio of San Francisco. Brewster’s an inspiring guy doing important work, and a much better use for my hypothetical legacy would be to leave it to them.    I wonder if they would guarantee, in exchange, to keep my memory alive, in much the same way that donors to religious organisations used to get prayers said in perpetuity for their departed souls….

But then I started wondering about the next stage.

If you were to train an AI system on all of my blog posts, YouTube videos, academic papers, podcast & media interviews, etc… how convincingly could you get it to respond to questions in the way that I would have done?  Perhaps a deepfake video character could even give future interviews on my behalf?  I can’t quite decide whether that’s exciting, or thoroughly creepy.

But I tell you what… I do think it’s inevitable.  

Perhaps not for me: I imagine that when I’m gone a few friends will shed a quiet tear and everyone else will breathe a huge sigh of relief and switch the servers off. But for others, those more prolific, more wise, more entertaining, I think this is bound to happen. You will be able to ask questions of Mother Teresa, or Christopher Hitchens, or the Dalai Lama, or Warren Buffett. You’ll be able to get Handel to compose your wedding march, and Peter Ustinov to speak at the reception afterwards. And for a bit of spiritual advice, you could always ask God. Or a ChatGPT engine trained exclusively on his revelations to mankind from whichever source you prefer them.

Today’s systems would, of course, do a very fallible job, but what will the AI systems be like in 100 years’ time? That will only help you, of course, if they still have access to your data, in non-proprietary, open, standard formats. In the past, if you had sufficient wealth, you might have chosen to spend it on cryonics. I can’t help feeling that to achieve immortality now, a better bet would be to spend it on good, globally-accessible backups of your data.

 

 

How not to design the front page of your website

I seem to be seeing more and more of those pop-up windows that, within seconds of you first visiting a website, ask whether you immediately want to fill in your email address so they can send you spam.  

Usually, it happens before I’ve even read the first sentence, let alone the first paragraph, so my reaction to “Would you like to receive updates from us?” is generally, “How the hell should I know? I’ve only seen your URL so far!”

So my curmudgeonly questions of the morning are:

  • Does anyone, anywhere, ever fill these in?  My basic respect for human intelligence would suggest not, but I suppose roughly half the world has below-average IQ.
  • Who are the fools who, when planning a shiny new website, decide that immediately obscuring it with one of these, and simultaneously annoying every new visitor to your site, is a good idea?
  • Are people who work in marketing actually the kind of people who would fill these in themselves?  Or do they just think everyone else is an idiot?  Either option would not reflect well on them, which leads me to an inevitable conclusion and final question.
  • Why do so many of those people with below-average intelligence work in marketing?

 

Sign of the times: might ChatGPT re-invigorate GPG?

It’s important to keep finding errors in LLM systems like ChatGPT, to remind us that, however eloquent they may be, they actually have very little knowledge of the real world.

A few days ago, I asked ChatGPT to describe the range of blog posts available on Status-Q. As part of the response it told me that ‘the website “statusq.org” was founded in 2017 by journalist and author Ben Hammersley.’ Now, Ben is a splendid fellow, but he’s not me. And this blog has been going a lot longer than that!

I corrected the date and the author, and it apologised. (It seems to be doing that a lot recently.) I asked if it learned when people corrected it, and it said yes. I then asked it my original question again, and it got the author right this time.

Later that afternoon, it told me that StatusQ.org was the personal website of Neil Lawrence.

Neil is also a friend, so I forwarded it to him, complaining of identity theft!

A couple of days later, my friend Nicholas asked a similar question and was informed that “based on publicly available information, I can tell you that Status-Q is the personal blog of Simon Wardley”.  Where is this publicly-available information, I’d like to know!

The moral of the story is not to believe anything you read on the Net, especially if you suspect some kind of AI system may be involved.  Don’t necessarily assume that they’re a tool to make us smarter!

When the web breaks, how will we fix it?

So I was thinking about the whole question of attribution, and ownership of content, when I came across this post, which was written by Fred Wilson way back in the distant AI past (i.e. in December). An excerpt:

I attended a dinner this past week with USV portfolio founders and one who works in education told us that ChatGPT has effectively ended the essay as a way for teachers to assess student progress. It will be easier for a student to prompt ChatGPT to write the essay than to write it themselves.

It is not just language models that are making huge advances. AIs can produce incredible audio and video as well. I am certain that an AI can produce a podcast or video of me saying something I did not say and would not say. I haven’t seen it yet, but it is inevitable.

So what do we do about this world we are living in where content can be created by machines and ascribed to us?

His solution: we need to sign things cryptographically.

Now this is something that geeks have been able to do for a long time.  You can take a chunk of text (or any data) and produce a signature using a secret key to which only you have access.  If I take the start of this post: the plain text version of everything starting from “It’s important” at the top down to “sign things cryptographically.” in the above paragraph, I can sign it using my GPG private key. This produces a signature which looks like this:

-----BEGIN PGP SIGNATURE-----
iQEzBAEBCgAdFiEENvIIPyk+1P2DhHuDCTKOi/lGS18FAmRJq1oACgkQCTKOi/lG
S1/E8wgAx1LSRLlge7Ymk9Ru5PsEPMUZdH/XLhczSOzsdSrnkDa4nSAdST5Gf7ju
pWKKDNfeEMuiF1nA1nraV7jHU5twUFITSsP2jJm91BllhbBNjjnlCGa9kZxtpqsO
T80Ow/ZEhoLXt6kDD6+2AAqp7eRhVCS4pnDCqayz0r0GPW13X3DprmMpS1bY4FWu
fJZxokpG99kb6J2Ldw6V90Cynufq3evnWpEbZfCkCl8K3xjEwrKqxHQWhxiWyDEv
opHxpV/Q7Vk5VsHZozBdDXSIqawM/HVGPObLCoHMbhIKTUN9qKMYPlP/d8XTTZfi
1nyWI247coxlmKzyq9/3tJkRaCQ/Aw==
=Wmam
-----END PGP SIGNATURE-----

If you were so inclined, you could easily find my corresponding public key online and use it to verify that signature.  What would that tell you?

Well, it would say that I have definitely asserted something about the above text: in this case, I’m asserting that I wrote it.  It wouldn’t tell you whether that was true, but it would tell you two things:

  • It was definitely me making the assertion, because nobody else could produce that signature.  This is partly because nobody else has access to my private key file, and even if they did, using it also requires a password that only I know. So they couldn’t  produce that signature without me. It’s way, way harder than faking my handwritten signature.

  • I definitely had access to that bit of text when I did so, because the signature is generated from it. This is another big improvement on a handwritten signature: if I sign page 6 of a contract and you then go and attach that signature page to a completely new set of pages 1-5, who is to know? Here, the signature is tied to the thing it’s signing.

Now, I could take any bit of text that ChatGPT (or William Shakespeare) had written and sign it too, so this doesn’t actually prove that I wrote it.  

But the key thing is that you can’t do it the other way around: somebody using an AI system could produce a blog post, or a video or audio file which claims to be created by me, but they could never assert that convincingly using a digital signature without my cooperation.  And I wouldn’t sign it. (Unless it was really good, of course.)
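To make the asymmetry concrete, here is a minimal Python sketch of signing and verifying. It is a toy under loudly stated assumptions: it uses textbook RSA built from two publicly known Mersenne primes, with no padding, purely so that it runs with the standard library alone. GPG itself uses the OpenPGP format with properly generated secret keys, and the names `sign`, `verify` and `digest` are invented for this illustration rather than taken from any real tool.

```python
import hashlib

# Toy key material: two publicly known Mersenne primes (an assumption for
# illustration only). Real keys use large, secret, randomly generated primes.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can check a signature."""
    return pow(signature, E, N) == digest(message)


post = b"His solution: we need to sign things cryptographically."
sig = sign(post)
print(verify(post, sig))                 # the signed text checks out
print(verify(post + b" (edited)", sig))  # any change to the text breaks the match
```

The second check is the contract-page point from above: the signature is computed from the text, so it cannot be lifted off and attached to different words.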

Gordon Brander goes into this idea in more detail in a post entitled “LLMs break the internet. Signing everything fixes it.”   The gist is that if I always signed all of my blog posts, then you could at least treat with suspicion anything that claimed to be by me but wasn’t signed.  And that soon, we’ll need to do this in order to separate human-generated content from machine-generated.

A tipping point?

This digital signature technology has been around for decades, and is the behind-the-scenes core of many technologies we all use.  But it’s never been widely, consciously adopted by ordinary computer users.  Enthusiasts have been using it to sign their email messages since the last millennium… but I know few people who do that, outside the confines of security research groups and similar organisations.  For most of us, the tools introduce just a little bit too much friction for the perceived benefits.

But digital identities are quickly becoming more widespread: Estonia has long been way ahead of the curve on this, and other countries are following along.  State-wide public key directories may eventually take us to the point where it becomes a matter of course for us automatically to sign everything we create or approve.

At which point, perhaps I’ll be able to confound those of my friends and colleagues who, according to ChatGPT, keep wanting to pinch the credit for my blog.


A new perspective?

Like everyone else, I’ve been having a bit of a play with ChatGPT. There have been some nice examples recently of people asking it questions and getting answers that are grammatically correct, well-presented, completely plausible, and completely wrong.

That didn’t happen to me. There was one question I felt I ought to ask it, and it gave a mostly correct response, though its description of the Trojan Room wasn’t quite right. Overall, though, not bad. But it was the last sentence that took me by surprise:

Screenshot of response to “What was the first webcam?”: “The first webcam was created at the University of Cambridge in 1991. It was installed in the Trojan Room, a computer laboratory in the university's department of engineering, and was used to monitor the condition of a coffee pot. This way, researchers working in the lab could check remotely whether there was coffee available without having to physically go to the room to check. The webcam's feed was made available online, making it one of the earliest examples of a web-based surveillance system.”

Well, yes, I suppose it was a surveillance system, though no human has used that phrase to me before when describing it!

Perhaps it’s only natural, though, that a machine should think of things chiefly from the point of view of the coffee pot?

Proboscidea

I’ve started playing a bit more with Mastodon.

For some of you, Mastodon will be old news — I’ve had a Mastodon account for four years, but haven’t really used it before — but many more, I imagine, will be saying, “Mastodon? Never heard of it!” I think you probably will be hearing a lot more about it, though, very soon, so I wanted to make sure you heard it here first!

At its simplest, Mastodon is a social network/microblogging platform rather like Twitter. And it’s getting a lot of attention at present as people are leaving Twitter, or at least exploring alternatives, because they don’t like what Elon Musk is doing with it or what they fear he might do in the future. I’m reserving judgement on that for the moment, but Mastodon has apparently gained about a million users in the last couple of weeks.

So what does Mastodon have going for it, apart from not being under Elon’s control? Well, a proper answer to that would be really quite long, but here are a few key points:

  • Mastodon is not run by any single company. It is not driven by profit as its primary motivation.

  • The feed(s) you see are based purely on whom you follow, and not on somebody else’s algorithm. There’s essentially no spam or advertising, at least for now, and it’s much less likely in the future.

  • Mastodon is ‘federated’, meaning that it consists of lots of servers talking to each other. Many people sign up to one of the big ones like ‘mastodon.social’ — you can find me as @quentinsf@mastodon.social — but there are lots of other options. Some servers (or ‘instances’) are built around particular interests or regions, others may be run by companies or other communities. Each server is moderated and managed by the people who run it, and one view you can choose shows you the new content from people on your instance.

  • But you can follow, and be followed by, people on any instance, not just the one you’re on. The best analogy here is email: a large number of people choose to use gmail.com, but they can still exchange emails with people on any other email server. You can choose to get your email service from Fastmail or Microsoft or Yahoo or you can run your own server. (Running a Mastodon instance, e.g. for your company, is rather easier than running an email server!)

  • You can have multiple accounts on multiple instances and switch between them easily. If you decide to move somewhere else, you can leave a forwarding address so people will find you, and you can even arrange that all your followers will follow you automatically. (Your actual posts are stored on the instance, though, so your history doesn’t come with you to the new place.)

  • For the technically-inclined, Mastodon instances communicate using an open protocol called ActivityPub, which is also used by other systems such as Nextcloud and PeerTube, and I suspect we’ll see it adopted more widely soon. For example, I’ve installed a plugin for this blog, so it can publish using ActivityPub. As well as following me as @quentinsf@mastodon.social, you can get notified of posts on this blog by following @qsf@statusq.org. (Please do!) If all goes well, this post will be one of the first I publish that way! The feed will be entirely independent of any other organisation, but you can still choose to follow it through, for example, any Mastodon apps or websites.
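The email analogy shows up in the address format itself. As a rough sketch, here is how a handle such as @quentinsf@mastodon.social splits into a user and a server, together with the WebFinger lookup URL (RFC 7033) that Mastodon servers use to find an account on another instance. The helper names `parse_handle` and `webfinger_url` are made up for this illustration, and the code only builds the URL; it makes no network request.

```python
from urllib.parse import urlencode


def parse_handle(handle: str) -> tuple[str, str]:
    """Split '@user@server' into ('user', 'server')."""
    user, sep, server = handle.lstrip("@").partition("@")
    if not (user and sep and server):
        raise ValueError(f"not a full fediverse handle: {handle!r}")
    return user, server


def webfinger_url(handle: str) -> str:
    """Build the WebFinger URL a server would query to locate the account."""
    user, server = parse_handle(handle)
    query = urlencode({"resource": f"acct:{user}@{server}"})
    return f"https://{server}/.well-known/webfinger?{query}"


for handle in ("@quentinsf@mastodon.social", "@qsf@statusq.org"):
    print(parse_handle(handle), webfinger_url(handle))
```

Just as with email, the part after the second @ tells any server where to go and ask, which is why no central directory is needed.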

What I like about this stuff is that, to me, it feels more like the way the internet was in the early days, and the way it should be: people running or choosing their own servers, and people reading and subscribing to content based on their own preferences and not on the profit-maximising algorithms of big American or Chinese corporates.

I hope it flourishes.

TikTok: Trojan Stallion

This is a great post by Scott Galloway warning about the influence of TikTok. Some have accused it of fear-mongering, but do read the whole thing and see what you think. Here are a few key points:

  • TikTok has over a billion users. This includes ‘nearly every U.S. teenager and half their parents’. The average monthly hours spent on it per user are way higher than for the other social networks. And the amount of data gathered about every interaction is vast.

  • All of its data are readily available to the Chinese government. TikTok is not actually allowed to operate in China, though, so this is purely data gathered about people in the rest of the world.

  • “Facebook is the most powerful espionage vehicle ever created and now China commands the most powerful propaganda tool”. The Russians have become very good at manipulating Facebook and Twitter, but the process is still much harder for Putin than it is for Xi Jinping.

So, Galloway warns, small changes in the configuration of the TikTok algorithms — just a thumb resting on the scale — can have a massive influence:

Dial up wholesome-looking American teens with TikTok accounts railing against the evils of capitalism. Dial down the Chinese immigrant celebrating the freedoms afforded in America. Push Trump supporter TikToks about guns and gay marriage into the feeds of liberals. Find misguided woke-cancel-culture TikToks and put them in heavy rotation for every moderate Republican. Feed the Trumpists more conspiracy theories. Anyone with a glass-half-empty message gets more play; content presenting a more optimistic view of our nation gets exiled. Hand on scale.

The network is massive, the ripple effects hidden in the noise. Putting a thumb the size of TikTok on the scale can move nations. What will have more influence on our next generation’s view of America, democracy, and capitalism? The bully pulpit of the president, the executive editor of the New York Times, or the TikTok algorithm?

Sobering stuff…

Thanks to the footnotes in John Naughton’s Observer column for the link.

Zoom calls of the past

Rose is moving from her current college office to a new one. In the bottom of a drawer, she found a Zoom modem.

For younger readers, this is a 56K modem, which means that on a really good day, you could transfer data to and from the network at 56 kilobits per second: that’s about 6 kilobytes/sec, once overheads are taken into account. This was pretty much the peak of telephone-based internet access, until ADSL came along.

Also in the same drawer was a floppy disk, which holds around 1.4MB. (I used to boot my first Linux system off one of these.)

So, to transfer the contents of this disk to the network using the modem, if you had a good reliable phone line, would take you about 4 minutes.

Now, the two originals of the photos above, which I snapped with my iPhone, between them take 7MB, or about 5 of those floppy disks, so to send the two still images would have taken around 20 minutes. (Not that we had digital still cameras at all back then, of course.)
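For anyone who wants to check the sums, here is the arithmetic as a short Python sketch. The figures are the rounded ones from the text, and treating 1 MB as 1,000,000 bytes is my simplifying assumption.

```python
# Rounded figures from the text; decimal units (1 MB = 1,000,000 bytes)
# are an assumption made to keep the sums simple.
MODEM_BYTES_PER_SEC = 6_000   # about 6 kilobytes/sec once overheads are counted
FLOPPY_BYTES = 1_400_000      # about 1.4 MB
PHOTOS_BYTES = 7_000_000      # the two iPhone photos, about 7 MB


def transfer_minutes(size_bytes: int, rate: int = MODEM_BYTES_PER_SEC) -> float:
    """Time to move size_bytes over the modem, in minutes."""
    return size_bytes / rate / 60


print(f"floppy: {transfer_minutes(FLOPPY_BYTES):.1f} min")
print(f"photos: {transfer_minutes(PHOTOS_BYTES):.1f} min")
print(f"photos in floppies: {PHOTOS_BYTES / FLOPPY_BYTES:.0f}")
```

That works out at a little under 4 minutes for the floppy and a little under 20 for the photos, matching the estimates above.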

This is why, when James T Kirk makes a call from his quarters to the bridge of the Enterprise, it’s almost always an audio call, and on the rare occasions when video is involved, they make sure they show you – it was such a wildly futuristic idea, even within the same starship!

Nobody, even on Star Trek, was daft enough to suggest he might make such a call from his communicator.

Signalling virtue

Dear Reader,

Can I encourage you to try something today? Go to Signal.org and get hold of the Signal messaging app, and/or go to your app store and download Signal for your phone. And while it’s downloading, come back here and I’ll tell you why I’ve become so fond of it, and why you might actually want another messaging app.

To put it in a nutshell, Signal is like WhatsApp but without selling your soul. Imagine what a good time Faust would have had without that awkward business with the Devil, and you get the idea. Well, OK… you don’t quite have to sell your soul to Facebook to use WhatsApp, but you do have to give away your privacy, your friends’ privacy, endure a lot of advertising, and so forth. (More info in an earlier post.)

For Apple users, Signal is rather like Messages, which I also like and use a lot, but you can use Signal with your non-Apple friends too, on all of your, and all of their, devices.

Signal:

  • is well-designed and nice to use.
  • runs on iOS, Android, Windows, Mac, Linux, tablets, desktop and mobile.
  • uses proper end-to-end encrypted communications, unlike some alternatives such as Telegram.
  • is Open Source, so if you doubt any aspect of it, you can go and see how it works.
  • is free: supported by grants and donations. No advertisements.
  • allows most of the interactions you expect on a modern messaging service: group chats, sharing files and images, audio and video chat, etc.

Now, of course, it has the problem that all networks initially have: what happens if none of my friends are on it? And yes, that can be an issue, but it’s becoming less so. When I first signed up, I think I knew about three other users. Now, over 100 of my contacts are there, and more arrive every week. When I see them pop up, I send them a quick hello message just to welcome them and let them know I’m here too. It’s a bit like wondering if you’re at the wrong party because you know so few people here, and then over time more and more of your friends walk through the door.

How do you find them? Well, like WhatsApp, Signal works on phone numbers, and when you sign up you have the option to let it scan your contacts list and see if any of them are on Signal too. Unlike Facebook/WhatsApp, however, your contacts’ details aren’t transmitted to the company’s servers and used to build the kind of personal profiles that FB keeps even on people who aren’t members.

Signal instead hashes the phone numbers in your contacts, truncates each hash so it can’t be used to recover the full phone number, and sends those truncated versions to its servers. If any of them match the truncated hashes of other accounts’ numbers, the server sends the possible matches back, still in hashed form, for your app to check. Security experts will realise that this isn’t perfect either, but it’s so much better than most of the alternatives that you can be much more comfortable doing it. Here’s a page talking about it with a link to more detailed technical descriptions about how they’re trying to make it even more secure. And here’s the source code for all their software in case you don’t trust what they say and want to check it out for yourself.
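The exchange described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Signal's actual protocol: the choice of SHA-256, the 4-byte truncation, the function names and the phone numbers are all assumptions made for the example.

```python
import hashlib

PREFIX_BYTES = 4  # assumed truncation length, chosen only for this sketch


def full_hash(phone: str) -> bytes:
    """Hash a phone number (SHA-256 is an assumption, not Signal's choice)."""
    return hashlib.sha256(phone.encode()).digest()


def truncated(phone: str) -> bytes:
    """The shortened form that is all the server ever sees from the client."""
    return full_hash(phone)[:PREFIX_BYTES]


def server_lookup(prefixes: set[bytes], registered: set[str]) -> set[bytes]:
    """Server side: return full hashes of registered numbers whose prefix was asked about."""
    return {full_hash(n) for n in registered if truncated(n) in prefixes}


def discover(contacts: list[str], registered: set[str]) -> list[str]:
    """Client side: send only truncated hashes, then confirm the matches locally."""
    candidates = server_lookup({truncated(c) for c in contacts}, registered)
    return [c for c in contacts if full_hash(c) in candidates]


registered_users = {"+441632960001", "+441632960002"}  # made-up numbers
my_contacts = ["+441632960002", "+441632960099"]
print(discover(my_contacts, registered_users))
```

The final comparison happens on your own device, so the server only learns which truncated prefixes you asked about, not your full contact list.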

So in recent months, if I’ve wanted to set up group chat sessions to discuss the care of an elderly relative, or plan a boating holiday with friends, or discuss software development with colleagues in another timezone, I tell people that I disconnected from Facebook a few years back so I don’t do WhatsApp, but have you tried Signal? It’s pretty much the same, with all the bad bits taken out, and works much better on the desktop and on tablets, in my now-rather-dated experience, than WhatsApp ever did.

So give it a try, and if you find that not many friends are there, don’t delete it. Just wait a bit… and tell all your friends about this post, of course!

The Christmas Lights Fallacy

Just twenty years ago, there was a popular factoid doing the rounds:

Half of the world’s population have never used a telephone.

I was working on technology for the developing world at the time, and this came up occasionally at conferences and other discussions. It was repeated by Kofi Annan, Al Gore, Melinda Gates, Newt Gingrich… It was one of those facts that was shocking enough to be interesting, but believable enough to make you think of the implications. You may think you’re in the midst of the dot-com boom, but remember that half the world has never even made a phone call…

But, as I blogged at the time, Clay Shirky went and did some research, and found that, actually, the statistic was first used in 1994, and, even if it had been true then, it certainly wasn’t by the time everyone was quoting it in the early 2000s. Seven years, it turned out, was a very long time in technology.

I was thinking of this today as I read Charles Arthur’s nice analysis of another recent assertion: that Bitcoin may use a lot of energy, but not as much as everybody’s Christmas lights! It’s a fun fact to surprise your friends with at the pub, perhaps, but, as Charles found out, it’s not quite grounded in reality. Take a look.

Sometimes it’s very good to have proper journalists around.

If a picture is worth a thousand words, what about an animation?

I’ve often joked that there are lies, damned lies, statistics and web statistics!

You’d have thought that when a web browser connects to a web server, you’d be able to count simple things like the number of visitors to your site with some accuracy, but it turns out to be rather complicated by caches at both ends, by search engines and other automated systems checking your site, by proxies and firewalls and VPNs and pre-loading and… well, you get the idea.

And it can get more difficult when you try to make generalisations about the web as a whole. Take the question of which web browser is the most popular. The browser generally tells the server, so you can come up with some numbers. But which servers’ numbers should you use? Those visited mostly by teenagers? By tech enthusiasts? By business people, or by mobile users? You’ll get very different numbers.

I use Safari for most things, and at the time of writing, these summary tables on Wikipedia will tell you that it has a share of about 3%, if you’re looking at desktop browsers as reported by NetMarketShare, somewhere around 40% if you’re looking at tablet-based browsers reported by StatCounter, and between 14% and 24% if you’re looking at browser usage overall, depending on whom you believe. So this figure is one to be taken with an even bigger pinch of salt than most.

Having said all that…

I do like this animation of web browser usage stats by James Eagle. For young people, it’s a history lesson, and for those of us who have lived through it and been intimately involved with it, this simple graphic encapsulates three decades of development and progress, of nostalgia and relief, of corporate battles and legal battles, of innovation and frustration, and of careers and companies born, thriving and expiring. Nicely done.

Here’s a link to James’s original tweet.

© Copyright Quentin Stafford-Fraser