I don't get why everyone is hellbent on getting LLMs to perform fact checking.

This is not the technology for it. Sure, it might sorta kinda work in some circumstances. That doesn't make it a good fit.

Think of it like buying a refrigerator for storing clothes.

Nietzsche might say this is not the fantasy of truth, but of comfort. The Last Man wants a machine to say 'fact wrong' or 'fact right' so the abyss of no ultimate truth can be made small enough to sleep beside.

Imagine the dystopian future where your freedom depends on convincing a panel of AI judges that you are innocent.

I assume you'd have access to AI lawyers too, better ones if you can pay for larger/newer models! Meanwhile the judges are N-year-old models because they are state funded, and they work 'fine'.

People ask questions to get answers. To me, that feels quite important, especially now that search engines are starting to push LLM answers.

Just because it is important for the use case does not mean we can make it work. It's a pretty well known fundamental limitation of the technology. No amount of elbow grease will get it there.

There's an interesting tradeoff here. A year or two ago, maybe it got facts right 50% of the time. Everyone knew not to rely on it.

Now, suppose we are 90% of the way there. Only technically proficient people would know not to trust it (like knowing not to install Internet Explorer toolbars, or remembering to use ad blockers).

A few years later, suppose we have spent a lot of money and effort getting it 99% of the way there. Trusting it would be somewhat natural by then. And then, in the important 1% of situations, it would stand to cause real harm. 1% seems low, but over a million invocations, you'd have 10,000 mistakes.
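To put rough numbers on that progression, here is a quick sketch. It assumes every invocation fails independently at a fixed rate, which is a simplification, and the function name `expected_mistakes` is made up for illustration.

```python
def expected_mistakes(accuracy: float, invocations: int) -> int:
    """Expected wrong answers, assuming each call fails independently
    with probability (1 - accuracy). This is an illustrative model only."""
    return round((1 - accuracy) * invocations)


# The three stages from the comment above: 50%, 90%, and 99% accurate.
for accuracy in (0.50, 0.90, 0.99):
    mistakes = expected_mistakes(accuracy, 1_000_000)
    print(f"{accuracy:.0%} accurate: {mistakes:,} mistakes per million calls")
```

The absolute number of mistakes drops at each stage, but so does the share of users who still think to double-check.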

Your progression is basically the exact same progression as things like Wikipedia, and web search in general. So I guess we don't need to hypothesize. Just look around and see how it's played out.

How many people take the first result on Google as gospel when looking things up?

Google search and Wikipedia both started out being fairly faithful to their source of truth.

Google pretty much guaranteed that their top results were relevant to the search query. And Wikipedia had an army of people making sure everything was backed up by the references.

Crucially, neither claimed to be an arbiter of truth.

But people use it for that. So what's your point?

It's a marketing failure (or success, depending on how you see it).

AI is pretty useful for a great many things, but to really attract more and more investment the current technique seems to be convincing people that AI is useful for everything.

You're probably right, but since Google Search displays an AI-generated answer as the first result, most people end up using this feature more often than they originally intended. It's there now, and it will likely replace traditional search for the general public. Not entirely, but perhaps to a large extent.

Edit: corrected bad spelling with AI XD

Search and fact checking are different problems though.

LLMs are pretty decent at 'search' given the inherent knowledge compression, and some amount of inaccuracy is fine.