This verification problem is general.
As an experiment, I had Claude Cowork write a history book. I chose as my subject a biography of Paolo Sarpi, a Venetian thinker most active in the early 17th century. I chose him because I know something about him but am far from an expert, because many of the sources are in Italian, a language in which I am a beginner, and because many of the sources are behind paywalls, which does not mean the AIs haven't been trained on them.
I prompted it to cite and footnote all sources and to avoid plagiarism and AI-style writing. After 5 hours it was finished (amusingly, it generated JavaScript and emitted a DOCX). And then I read the book. There was still a lingering jauntiness and breathlessness ("Paolo Sarpi was a pivotal figure in European history!"), but various online checkers did not detect AI writing or plagiarism. I spot-checked the footnotes and dates, but checking them thoroughly was clearly a huge job, especially since I couldn't see behind the paywalls (if I worked for a university I probably could).
Finally, I used Gemini Deep Research to confirm the historical facts and that all the cited sources exist. Gemini thought it was all good.
But how do I know Gemini didn't hallucinate the same things Claude did?
Definitely an incredible research tool. If I were actually writing such a book, this would be a big start. But verification would still be a huge effort.
I used Gemini to look up a relative with a connection to a famous event. The relative himself is obscure, but I have some of his writings and I've heard his story from other relatives. Gemini fabricated a completely false narrative about him that was much more exciting than what actually happened. I spent a lot of time going through the sources Gemini supplied trying to verify things, and although the sources were real, the story Gemini came up with was completely made up.
Yup. I've had Gemini create fake citations to papers. I've also had it hallucinate the contents of paywalled papers, so I know I can't trust anything it writes, though I am getting better at using it recursively to verify things.
I am certain I read an article posted on HN a month or so ago about some researchers who were caught using false citations in their research.
If I remember correctly, some group used an AI tool to sniff out AI-generated citations in others' work. What I remember most was how egregious some of the citations the sniffer caught were. One citation's author was literally listed as "FirstName LastName" -- they didn't even sub in a fake name lol.
Edit: I found the OP:
https://news.ycombinator.com/item?id=46720395
I believe that, on a fundamental level, the principle of 'trust, but verify' can be followed to its logical endpoint, as covered in Ken Thompson's lecture 'Reflections on Trusting Trust' [1]. At some point, one simply has to trust that something is correct, unless one has the capability to verify every step of a long chain of indirection.
So, in regard to your book: Claude may or may not have hallucinated the information from its cited sources. Gemini, as well. However, say you had access to the cited information behind a paywall. How would you go about verifying that the information in those sources was itself correct?
In the four years or so since LLMs were released, I have noticed a trend where people are (rightfully) hesitant to trust the output of LLMs. But if the knowledge is in a book or comes from any other man-made source, it's somehow infallible? Such thinking reminds me of my primary-school days. Teachers would not let us use Wikipedia as a source because, "Anyone can edit anything." Though, it's not as if one cannot write anything they want in a book -- be it true or false.
How many scientific researchers have p-hacked their research, falsified data, or used other methods of deceit? I do not believe it's truly an issue on a grand scale, nor does it make vast amounts of science illegitimate. When offenders are caught, the punishment is usually serious, but there's no telling how much falsified research was never caught.
I do believe any and all information provided by LLMs should be verified and not blindly trusted; however, I extend that same policy to works from my fellow humans. Of course, no one has the time to verify every single detail of every bit of information one comes across. Hence, at some point, we all must settle on trusting in trust. Knowledge that we cannot verify is not knowledge. It is faith.
[1] https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
> But if the knowledge is in a book or comes from any other man-made source, it's somehow infallible
Nobody who's ever done research believes that. Everything gets put along a spectrum of trust/accuracy.
You should be able to say what you believe, what you base that belief on, what it would take to disprove that belief, and how likely you think it is to be disproven.
That's why you do research from as many primary sources as you can, because yeah, otherwise you're reading someone else's interpretation. Sometimes you can't do that (you don't read the language, etc) and then you have to judge the quality of the interpretation.
It's an enormous amount of work to write a book, and making things up doesn't make that process a whole lot easier. So most people try to be accurate, especially with editors and the like double-checking the work. I still always judge the quality of the work as I'm reading it.
LLMs just flat out can't be trusted. They're endless fountains of words and aren't accurate by nature. They're fine if you already know the answer, and not fine if you don't.
This is great; your final line summarizes my thoughts as well. When it comes to matters of faith, your average Redditor or Hacker News commenter will heap scorn and derision on religious people for accepting things blindly without any proof, yet they will blindly accept what other people tell them is true, or now, what an LLM says is true.
Before AI, the smartest human still had to pass the paywall to access paywalled content.
AI has exacerbated the Internet's "content must be free or else it does not exist" trend.
It's just not interesting to challenge an AI to write professional research content without giving it access to research content. Without access, it's just going to paraphrase what's already available.