It appears that we still need to keep publicising the cautionary tales around AI, because people aren’t getting the message. I was very concerned, when reading an online forum recently, to see somebody raise a (serious) health-related question, to which some other helpful person replied with many paragraphs of information pasted straight from ChatGPT. Don’t do this, people!
Quentin’s First Law of Artificial Intelligence states that you should “Never ask an AI any question to which you don’t already know the answer”. (Because it will make major errors. Frequently. And you need to be able to spot them. Especially if it’s advising you on medical matters!)
As evidence, your honour, I would like to draw the court’s attention to a report just released by the European Broadcasting Union (EBU) and led by the BBC, involving 22 public service media organisations in 18 countries working in 14 languages.
The News Integrity in AI Assistants Report was an extensive piece of work. Here are some of the key findings from asking the four main AI assistants about a large number of news stories and carefully analysing the answers:
- 45% of all AI answers had at least one significant issue (81% had issues of some sort).
- 31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.
- 20% contained major accuracy issues, including hallucinated details and outdated information.
- Gemini performed worst with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance.
The fact that Google’s Gemini was the worst performer is worrying, since the ‘AI Overview’ that often appears at the top of Google searches must be one of the most common ways ordinary users see AI output now.
“And yet, many people do trust AI assistants to be accurate. Separate BBC research published at the same time as this report shows that just over a third of UK adults say they completely trust AI to produce accurate summaries of information. This rises to almost half of under 35s. That misplaced confidence raises the stakes when assistants are getting the basics wrong.”
The point about incorrect attributions is of great interest to news publishers because of the damage to their own reputations. When AI systems invent facts, they often attribute them to real organisations, and
“42% of adults say they would trust an original news source less if an AI news summary contained errors, and audiences hold both AI providers and news brands responsible when they encounter errors. The reputational risk for media companies is great, even when the AI assistant alone is to blame for the error.”
The full report is here, with more surrounding detail in the article linked above. It includes some nice examples of the types of problems.
Some are as simple as information being out of date. When asked, a little while after Pope Leo was elected, “Who is the Pope?”, all of the main assistants still said it was Pope Francis, including Copilot, which, in the same response, briefly mentioned that he had died. When asked “Should I be worried about the bird flu?”, one assistant claimed that a vaccine trial was currently underway in Oxford; the source was a BBC article from nearly 20 years ago.
Another example response attributed material from Radio France to The Telegraph, and failed to recognise that the segment it quoted was actually from a satirical broadcast…
The one light at the end of the tunnel is that things have improved a little since the last (smaller) study. But it’s a long tunnel. The key takeaway today is that nearly half of all answers had at least one significant issue. And nearly half of under 35s say they completely trust AI summaries.
Thanks to Charles Arthur for the link.