The problem with the news is: who decides which outlets should be blindly trusted by the LLMs and which shouldn't? It also opens the door to government overreach, say a mandate that LLMs must use Fox News as a source of verified, vetted information.
Barring that, we are still relying on the execs at the model companies to pick and choose news outlets, and they have their own biases.
Simplest path to the most generally reliable results:
* Trust consensus across publicly-funded news outlets from outside of the US the most
* Then consensus across private news agencies from outside of the US (across countries)
* Then individual trust from publicly-funded news outlets, then private
* Then multinational non-profit advocacy groups based outside of the US
* Then public broadcasters in the US
* Then local news agencies inside the US when the topic is relevant to local news
* Then national news agencies inside the US
All facetiousness aside, the idea should be to analyze consensus across multiple sources with different biases and agendas. Don't trust any one story from any one source, but look for multiple stories from multiple sources and synthesize results from that. Where they disagree, note it in the output. If they have a source, go analyze the source rather than taking their interpretation at face value.
Even if I thought that CNN was a thousand times more reliable than Fox News, CNN could still make mistakes, either factually or editorially, and repeating those mistakes can still be damaging even if they weren't intentional or malicious.
If the Washington Post and Fox News agree on something, that doesn't mean it's more likely to be correct. If The Guardian and Die Welt agree on something, that's a more reliable signal. If CBC News and Fox News agree on something, that's a strong signal.
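To make the consensus idea concrete, here is a rough sketch of how agreement could be summarized per claim. Everything in it is an assumption for illustration: the `Report` fields, the outlet names, and the choice to measure diversity by counting distinct countries and funding models. It is not how any existing LLM or search product works.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    outlet: str     # hypothetical outlet name
    country: str    # assumed label: where the outlet is based
    funding: str    # assumed label: "public" or "private"
    claim: str      # normalized claim the story makes
    supports: bool  # True if the story supports the claim


def consensus(reports: list[Report]) -> dict[str, dict]:
    """Summarize, per claim, who agrees, who disputes, and how diverse the agreement is."""
    by_claim: dict[str, list[Report]] = defaultdict(list)
    for report in reports:
        by_claim[report.claim].append(report)

    summary = {}
    for claim, group in by_claim.items():
        pro = [r for r in group if r.supports]
        con = [r for r in group if not r.supports]
        summary[claim] = {
            "supporting": sorted(r.outlet for r in pro),
            "disputing": sorted(r.outlet for r in con),
            # Agreement across different countries and funding models
            # counts for more than agreement among similar outlets.
            "countries": len({r.country for r in pro}),
            "funding_models": len({r.funding for r in pro}),
            # Disagreement is surfaced rather than resolved silently.
            "disputed": bool(pro) and bool(con),
        }
    return summary


reports = [
    Report("Outlet A", "CA", "public", "X happened", True),
    Report("Outlet B", "US", "private", "X happened", True),
    Report("Outlet C", "US", "private", "X happened", False),
    Report("Outlet D", "US", "private", "Y happened", True),
    Report("Outlet E", "US", "private", "Y happened", True),
]
summary = consensus(reports)
```

In this toy data both claims have two supporting outlets, but only "X happened" is backed across two countries and two funding models, and it is also the one flagged as disputed. The raw count alone would hide both facts.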
Also worth a read: countries with public broadcasters have healthier democracies: https://www.niemanlab.org/2022/01/do-countries-with-better-f...
On scientific topics, not a single source you listed is in any way accurate at all. These are things that can be calculated and known with very high accuracy, not matters of opinion, and yet these sources still get them wrong the majority of the time. And there are plenty of scientific topics which have major impact on policy. Maybe we need to take certain decisions out of the hands of the scientifically illiterate.
PS The BBC (which would be in your highest level) has had to retract stories so often over the last 3 or 4 years that it became a meme to have them apologize for being wrong because they didn't know some video source came from an ML model.
> On scientific topics, not a single source you listed is in any way accurate at all.
My rebuttal to that is twofold:
First, the discussion is about news, not science (nor about general LLM behaviour).
Second, and probably more relevant, I explicitly said 'if they have a source, go analyze the source rather than taking their interpretation at face value'. When I wrote that I was thinking specifically about what I assume is your point, which is how often news articles about scientific discoveries or science news miss, misunderstand, or exaggerate the point of the original research, sometimes to the point of being as useful to society as celebrity gossip.
> And there are plenty of scientific topics which have major impact on policy. Maybe we need to take certain decisions out of the hands of the scientifically illiterate.
I would be in favour of mandating that governments make decisions based on established scientific fact rather than the vibes they wish existed, restricting the decision making to 'how do we react to these facts as a society' and not 'which facts should we imagine are true to justify the policies we want'.
> PS The BBC (which would be in your highest level) has had to retract stories so often over the last 3 or 4 years that it became a meme to have them apologize for being wrong because they didn't know some video source came from an ML model.
Aside from being a good reason to support AI fingerprinting on generated media, this is covered by my existing point:
"consensus across publicly-funded news outlets"
"the idea should be to analyze consensus across multiple sources with different biases and agendas. Don't trust any one story from any one source, but look for multiple stories from multiple sources and synthesize results from that"
If the BBC reports on something because they got duped but they're the only ones who did, then there's a distinct lack of consensus, and consensus is the main argument of my post.
Lastly, and this is generally off-topic, but at least the BBC issues retractions (which LLMs could then also consume and use in their results). There's a lot of 'news media' out there that will happily parrot talking points they wish were true, or blindly report what they're told, but have no interest in publishing retractions after they push falsehoods, deliberately or not, to their customers.
> First, the discussion is about news, not science (nor about general LLM behaviour).
What if science is the news, such as:
1. advancements in fusion power; or
2. progress/status of the Artemis missions; or
3. new LLM models and/or capabilities (e.g. Project Glasswing).
With things like that you typically have a press announcement/briefing, a research paper/publication, or both. That information is then presented in newspapers/media that may obscure, misrepresent, or overly generalize the original finding/announcement.
There may also be clarifications, retractions, etc. after publication, such as with the initial announcement of the proof of Fermat's Last Theorem, which contained an error that was later corrected.
"First, the discussion is about news, not science (nor about general LLM behaviour)."
That's a false dichotomy. Consider energy policy. What kind of power do you need to add to your grid? What are the risks for each type of power? How much CO2 does each type of power emit, etc? These are scientific questions which directly impact public policy and are consistently misreported by news sources.
So there is no line between these things. It is, however, an area where accuracy can be measured. And when we do that, it's hard to argue that allowing journalists without technical credentials to continue to have a platform is a good idea.
And I can make the same argument about several other topics including military matters. Literally, the 2 weapons systems the media hates the most have the 2 best track records on the battlefield. They aren't just wrong. They are literally the opposite of correct on many topics.
Maybe Google could come up with some fancy algorithm to give variable weight to the source pages, some sort of ranking system for pages on the web, instead of just assuming any random page contains 100% truth. Perhaps counting the tally of other pages on the web linking to this one might be one clue that this is a particularly highly ranked page? It would be quite the revolutionary idea!
I totally agree: centralization is dangerous. Ideally we want any output to be corroborated by multiple, independent sources of truth. But given that the alternative is the absolutely unregulated, unaccountable, wild west of arbitrary content posted on the Internet, I cannot see a solution besides some sort of centralization of trust.
I would still maintain that the solution would be to have LLMs doing 'research' (by querying news for recent events) to ensure they're checking multiple sources, and to be explicit about which sources there were, whether those sources had sources, and whether their claims were uncorroborated or unsubstantiated.
The problem, IMHO, is that the LLMs are happily regurgitating facts from whoever, wherever, whenever. Even with a centralization of trust, e.g. 'We know La Presse is reputable and can be given the benefit of the doubt', mistakes can still be made. Without the LLMs cross-checking what they learn the output is still entirely unreliable.
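As a rough illustration of being explicit about sources, the sketch below labels a claim with who reported it and whether anything backs it up. The `Finding` type, the `annotate` helper, the two-outlet threshold, and the wording of the labels are all made up for this example, not taken from any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str
    outlets: list[str]                                         # who reported the claim
    primary_sources: list[str] = field(default_factory=list)   # what those reports cite


def annotate(finding: Finding, min_outlets: int = 2) -> str:
    """Attach provenance and caveats to a claim before it reaches the user."""
    outlets = sorted(set(finding.outlets))
    caveats = []
    # Assumption: fewer than `min_outlets` distinct outlets means uncorroborated.
    if len(outlets) < min_outlets:
        caveats.append("uncorroborated")
    # Assumption: no cited primary source means unsubstantiated.
    if not finding.primary_sources:
        caveats.append("unsubstantiated")
    label = "; ".join(caveats) if caveats else "corroborated and sourced"
    return f"{finding.claim} [reported by: {', '.join(outlets)}] ({label})"


single = annotate(Finding("X happened", ["Outlet A"]))
backed = annotate(
    Finding("Y happened", ["Outlet B", "Outlet A"], ["original press release"])
)
```

Here `single` comes out flagged as both uncorroborated and unsubstantiated, while `backed` is labelled as corroborated and sourced. Even a reputable outlet reporting alone would get the first caveat.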