>Wikipedia took a very long time to get close to reliable,
And that's a good thing to remember. Always be skeptical and know the strengths and weaknesses of your sources. Teachers taught me (and maybe you) to be skeptical and not use Wikipedia as a citation for a reason. Even today, it is horrible for covering current events, and its framing of recent history can fluctuate massively. That isn't me dismissing Wikipedia as a whole, nor saying it has no potential.
>Google was pretty reliable for a bit, but for a while now the reliability of its results has been the butt of jokes.
Yes, all the more reason to scrutinize. It's a bit unfortunate how often it's the 3rd-5th result that is more reliable than the SEO-optimized slop that won the race for first place, unless I am using very specific queries.
---
Now let's consider these chatbots. There's no sense of editorial oversight, they are not deterministic, and they are known to constantly hallucinate instead of admitting ignorance. There does not seem to be any real initiative to fix such behavior; instead it gets ignored and dismissed with "the tech will get better".
Meanwhile, we saw the most blatant piece of abuse last week when Grok was updated, showing that these are not some impartial machines simply synthesizing existing information. They can be tweaked to a private estate's whims the same way a search algorithm or a biased astroturfer can do with the other two subjects of comparison. There's clear flaws and no desire nor push to really fix them; they simply get cast off as a bug to fix instead of the societal letdown they should be viewed as.
Mm. Generally agree.
Unfortunately, I have to disagree about this part:
> There's clear flaws and no desire nor push to really fix them
All those private estate's whims? Those are the visible outcomes of the pushes to "fix" them. Sure, "fix" has scare quotes for good reason, but it is the attempt.
Also visible with the performance increases. One of the earlier models I played with got confused halfway through about which language it was supposed to be using, flipping suddenly and for no apparent reason from JS to Python.
I try to set my expectations at around the level of "enthusiastic recent graduate who has yet to learn they can't fix everything and wants to please their boss". Crucially for this: an individual graduate, so no 3rd party editor to do any editorial overview. The "reasoning" models try to mimic self-editorial, but it's a fairly cheesy process of replacing the first n ~= 10 "stop" tokens with the token "wait", or something close to that, and it's a case of the trope "reality isn't realistic" that this even works at all.
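That "replace early stop tokens with 'wait'" trick can be sketched in a few lines. This is a toy simulation of the idea, not any vendor's actual implementation: `budget_forced_generate`, `toy_step`, and the `<stop>` token are all hypothetical names, and a real version would hook into an LLM's sampling loop rather than a Python callback.

```python
# Toy sketch of "budget forcing": the first n early stops are
# overridden with the token "wait", nudging the model to keep
# "reasoning" instead of ending its answer.

def budget_forced_generate(step, max_waits=10, max_tokens=200):
    """step(history) returns the next token; the first `max_waits`
    times it emits the stop token, substitute "wait" instead."""
    tokens, waits = [], 0
    while len(tokens) < max_tokens:
        tok = step(tokens)
        if tok == "<stop>":
            if waits < max_waits:
                waits += 1
                tokens.append("wait")  # override the early stop
                continue
            break  # wait budget exhausted: let it actually stop
        tokens.append(tok)
    return tokens

# Hypothetical "model" that tries to stop every fourth token.
def toy_step(history):
    return "<stop>" if len(history) % 4 == 3 else f"t{len(history)}"
```

Running `budget_forced_generate(toy_step)` yields an output containing exactly ten "wait" tokens before the model is finally allowed to stop, which is the whole mechanism: no self-editing, just a forced continuation.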
>All those private estate's whims?
To do the least amount of work and get the most profit? I'd say so.
I should probably specify this better: private AI companies do not want to
1) find efficient solutions, rather than brute-forcing models with more data (data of dubious provenance, as of now) and larger data centers. DeepSeek's "shock" to western LLMs at the beginning of the year was frankly embarrassing in that regard.
2) introduce any transparency into their weights and models. If we both believe that ads are an inevitable allure, there is a huge incentive for them to keep such data close to the chest.
>try to set my expectations at around the level of "enthusiastic recent graduate who has yet to learn they can't fix everything and wants to please their boss"
I sure do wish the industry itself had such expectations. I know this current wave of "replace with AI" won't last long for the tech industry as companies realize they cut too much, but companies sure will try anyway (or use it as an excuse/justification for layoffs they wanted to do anyway) and make a bumpy economy bumpier.