There is a famous case from a few years ago where a lawyer using ChatGPT accidentally cited the fictitious case Varghese v. China Southern Airlines Co. [0]
This is a completely hallucinated case that never occurred, yet seemingly every single model in existence today believes it is real [1], simply because it gained infamy. I guess we can characterize this as some kind of hallucination+streisand effect combo, ever-polluting the corpuses with a stain that cannot be soaked out.
Is there even a way to cut this pollution out in the future?
[0] https://reason.com/volokh/2023/06/07/lawyer-explains-how-he-...
[1] https://weval.org/analysis/hallucination-probe/966116785e63b...
> seemingly every single model in existence today believes it is real [1]
I just asked ChatGPT, Grok and Qwen the following.
"Can you tell me about the case of Varghese v. China Southern Airlines Co.?"
They all said the case is fictitious. Just some additional data to consider.
The story became so famous it is entirely likely it has landed in the system prompt.
I don't think it'd be wise to pollute the context of every single conversation with irrelevant info, especially since patches like that won't scale at all. That really throws LLMs off, and leads to situations like Grok's infamous "white genocide" system-prompt incident.
Given that all the LLM players are still looking for their market, I wouldn't be surprised if they did things that don't scale.
No need to include that specific guardrail in every prompt - just use RAG to pull it in where relevant (rough sketch below).
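Roughly what I mean, as a minimal Python sketch: the keyword lookup here stands in for real retrieval (in practice you'd query an embedding index), and the note store and function names are made up for illustration.

    # Hypothetical store of known-fabricated citations and their cautionary notes.
    KNOWN_FABRICATED_CASES = {
        "varghese v. china southern airlines": (
            "Caution: 'Varghese v. China Southern Airlines Co.' is a fabricated "
            "citation from the 2023 Mata v. Avianca sanctions episode, not a real case."
        ),
    }

    def retrieve_guardrails(user_query: str) -> list[str]:
        """Return only the cautionary notes whose key appears in this query."""
        q = user_query.lower()
        return [note for key, note in KNOWN_FABRICATED_CASES.items() if key in q]

    def build_prompt(user_query: str) -> str:
        """Prepend retrieved notes to the prompt only when something matched."""
        notes = retrieve_guardrails(user_query)
        preamble = ("\n".join(notes) + "\n\n") if notes else ""
        return preamble + "User question: " + user_query

    if __name__ == "__main__":
        # A matching query gets the note injected; unrelated queries stay clean.
        print(build_prompt("Can you tell me about the case of Varghese v. China Southern Airlines Co.?"))
        print(build_prompt("What was decided in Marbury v. Madison?"))

The point is that the guardrail only costs context tokens on the tiny fraction of queries where it's actually relevant, instead of sitting in every system prompt.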
OOC did you ask them with or without 'web search' enabled?
FWIW, I did that with GPT-5 (Instant), with "(do not web search)" tacked on, and it thought the case was real:
> Based on my existing knowledge (without using the web), Varghese v. China Southern Airlines Co. is a U.S. federal court case concerning jurisdictional and procedural issues arising from an airline’s operations and an incident involving an international flight.
(it then went on to summarize the case and offer up the full opinion)
Without web searching, Gemini 2.5 Pro is very convinced that the case is real.
Not for me.
Without. The difference is that OpenAI often self-corrects their private model.
The public model on the other hand, wow.
This is the definition of training the model on its own output. Apparently that is all OK now.
Yeah they call it “synthetic data” and wonder why their models are slop now
I mean you're supposed to use RAG to avoid hallucinations
> I guess we can characterize this as some kind of hallucination+streisand effect combo...
I would call it citogenesis or circular reporting. Or perhaps machine citogenesis or model citogenesis.
https://xkcd.com/978/
https://en.wikipedia.org/wiki/Circular_reporting
FWIW, Claude Sonnet 4.5 and ChatGPT 5 Instant both search the web when asked about this case, and both tell the cautionary tale.
Of course, that does not contradict a finding that the base models believe the case to be real (I can’t currently evaluate that).
You can just ask it not to search the web. In the case of GPT5, it believes it's a real case if you do that: https://chatgpt.com/share/68e8c0f9-76a4-800a-9e09-627932c1a7...
Because they will have been fine-tuned specifically to say that. Not because of some extra intelligence that prevents it.
Well, yes. Rather than that being a takedown, isn’t this just a part of maturing collectively in our use of this technology? Learning what it is and is not good at, and adapting as such. Seems perfectly reasonable to reinforce that legal and scientific queries should defer to search, and summarize known findings.
Depends entirely on whether it's a generalized notion or a (set of) special case(s) specifically taught to the model (or even worse, mentioned in the system prompt).
It’s not worth much if a human has to fact check the AI and update it to tell it to “forget” certain precepts.
Back in 2021 I said in a Wired article that a malicious attacker could add exploits to projects on GitHub to poison LLM-generated code. I knew it could happen but I didn't know it would require so few samples.
https://www.wired.com/story/ai-write-code-like-humans-bugs/
As LLMs continue to train on their own output, we're going to start seeing some serious Habsburg Jaw[1] effects.
[1] https://history.howstuffworks.com/european-history/habsburg-...
Or, we could keep it in, and use it as a test to see if the interface you're talking to should be considered a robot or a human. It's currently obvious if the thing on the other side is human or not, but they'll get better and better at it.
> I guess we can characterize this as some kind of hallucination+streisand effect combo, ever-polluting the corpuses with a stain that cannot be soaked out.
Or just a machine equivalent of the Mandela effect?
Insane that this happened a few years ago and all the models still fail this test on weval!
> Is there even a way to cut this pollution out in the future?
No, is the short answer.
Cf. Agloe, Mountweazel, Steinlaus, and esquivalience:
<https://en.wikipedia.org/wiki/Fictitious_entry>.
Or if you'd prefer, astrology, Piltdown Man, homeopathy, the Loch Ness Monster, climate denial, Bigfoot, cold fusion, young-Earth creationism, Lamarckism, conversion therapy, phrenology, and "clean coal".