This is one of those “don’t be evil”-style articles that companies remove when the going gets tough, but I guess we should be thankful that things are looking rosy enough for Anthropic at the moment that they would release a blog post like this.
The point about filtering signal from noise in search engines can’t be made often enough. At this point, using a search engine – and the conventional internet in general – is an exercise in frustration. It’s simply a user-hostile place: infinite cookie banners on sites that shouldn’t collect data at all, autoplay advertisements, engagement farming, sites generated by AI to shill products and pad out a word count. You could argue that AI exacerbates this situation, but you also have to agree that it is much more pleasant to ask Perplexity, ChatGPT, or Claude a question than to put yourself through the torture of conventional search. Introducing ads into this would completely deprive users of the one way of navigating the web that still respects their dignity.
I also agree in the sense that the current crop of AIs does feel like a space to think, as opposed to a place where I am being manipulated, controlled, or treated like a sheep in a flock to be sheared for cash.
The current crop of LLM-backed chatbots does have a bit of that “old, good internet” flavor: a mostly unspoiled frontier where things are changing rapidly, potential seems unbounded, and the people molding the actual tech and discussing it are enthusiasts with a sort of sorcerer’s apprentice vibe. Not sure how long it can persist, since I’ve seen this story before and we all understand the incentive structures at play. Does anyone know whether there are precedents for PBC or B-Corp type businesses being held accountable for betraying their stated values? Or is it just window dressing with no legal clout? Can they convert to a standard corporation on a whim and ditch the non-shareholder-maximization goals?
There’s nothing old-internet about these AI companies. The old internet was about giving things away and asking for nothing in return. These companies take everything and give back nothing – unless you are willing to pay, that is.
I get the sentiment, but if you can't acknowledge that AI is useful and currently a lot better than search for a great many things, then it's hard to have a rational conversation.
why do they need to acknowledge something outside of the point they're trying to make?
Because it was a middlebrow dismissal of the GP
because that's how conversations work. anything less is sparkling debate.
how is it useful to be fed misleading nonsense?
Just enjoy the "good times" powered by other people's money.
No, they don't. They soak up tons of your most personal and sensitive information like a sponge, and you don't know what's done with it. In the "good old Internet", that did not happen. Also in the good old Internet, it wasn't the masses all dependent on a few central mega-corporations shaping the interaction, but a many-to-many affair, with people and organizations of different sizes running the sites where interaction took place.
Ok, I know I'm describing the past through rose-tinted glasses. After all, the Internet started as a DARPA project. But still, current reality is itself rather dystopian in many ways.
> This is one of those “don’t be evil”-style articles that companies remove when the going gets tough, but I guess we should be thankful that things are looking rosy enough for Anthropic at the moment that they would release a blog post like this.
Exactly this. Show me the incentive, and I'll show you the outcome, but at least I'm glad we're getting a bit more time ad-free.
And it's very timely and intentional: Gemini is already shoving product links in my face repeatedly, while OpenAI has recently started testing ads. [0]
[0] https://openai.com/index/our-approach-to-advertising-and-exp...
Right, if there's no legal weight to any of their statements then they mean almost nothing. It's a very weak signal and just feels like marketing. All digital goods can and will be made worse over time if it benefits the company.
> I guess we should be thankful that things are looking rosy enough for Anthropic
Forgive me if I am not.
> Introducing ads into this would completely deprive the user of a way of navigating the web in a way that actually respects their dignity.
Say what you will, there are at least ad blockers for ads on the internet. There are _no_ ad blockers for ads in chatbots.
I agree, but at least this is a policy. "Don't be evil" was vague bullshit.
Current LLMs often produce much, much worse results than manually searching.
If you need to search the internet on a topic that is full of unknown unknowns for you, they're a pretty decent way to get a lay of the land, but beyond that, off to Kagi (or Google) you go.
Even worse is that the results are inconsistent. I can ask Gemini five times at what temperature I should take a waterfowl out of the oven, and get five different answers, 10°C apart.
You cannot trust answers from an LLM.
> I can ask Gemini five times at what temperature I should take a waterfowl out of the oven, and get five different answers, 10°C apart.
Are you sure? Both Gemini and ChatGPT gave me consistent answers 3 times in a row, even if the two services' answers differ slightly from each other.
Their answers are in line with this version:
https://blog.thermoworks.com/duck_roast/
What do you mean, "are you sure"? I literally saw it happen in front of my eyes, and I still do. Just now I tested it with slight variations of "ideal temperature waterfowl cooking", "best temperature waterfowl roasting", etc., and all of these questions yielded different answers, with temperatures ranging from 47°C to 57°C (ignoring the 74°C food-safety ones).
That's my entire point: even adding an "is" or "the" can get you wildly different advice. No human would give you different info when asked "what's the waterfowl's best cooking temperature" vs. "what is waterfowl's best roasting temperature".
Did you point that out to one of them… like “hey bro, I’ve asked y’all this question in multiple threads and get wildly different answers. Why?”
And the answer is probably that there is no single ideal temperature for waterfowl; the real answer is "it depends," and you didn't give it enough context to answer your question well.
Context is everything. Give it poor prompts, you’ll get poor answers. LLMs are no different than programming a computer or anything else in this domain.
And learning how to give good context is a skill. One we all need to learn.
But that isn't how normal people interact with search engines, and that's the whole argument being made here: that LLMs are now better 'correct answer generators' than search engines. They're not. My mother experienced that directly. Her food would have come out much better if she had completely ignored Gemini and checked a site.
One of the best things LLMs could do (and that no one seems to be doing) is admit uncertainty. If the mean probability of the tokens in a response drops below some threshold X, it should just say "I don't know, you should check a different source." (A rough sketch of what that could look like is at the end of this comment.)
At any rate: which experience is superior, my mother figuring out some ten-sentence, multi-part question so the LLM finally gives a good, consistent answer, or just typing "best Indian restaurant in Brooklyn" (maybe even with "site:restaurantreviews.com")?
> LLMs are no different than programming a computer or anything else in this domain.
Just to reiterate against this: virtually no one programs their search queries or prompt-engineers a 10-sentence search query.
If I made a new, non-AI tool called 'correct answer provider' that provided definitive, incorrect answers, you'd call it bad software. But because it's AI, we blame the user for not second-guessing the answers or for holding it wrong, i.e. bad prompting.
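To sketch the abstention idea from above: assuming the API exposes per-token log-probabilities (OpenAI's chat completions do when called with logprobs=True), a thin wrapper could refuse to answer when the mean log-probability is low. The threshold and names below are made up for illustration:

```python
# Hypothetical "admit uncertainty" gate over per-token log-probabilities.
# The -1.0 threshold (a mean per-token probability of roughly e^-1 ≈ 0.37)
# is an arbitrary, illustrative cutoff, not a recommended value.

def confident_enough(token_logprobs: list[float], threshold: float = -1.0) -> bool:
    if not token_logprobs:
        return False
    return sum(token_logprobs) / len(token_logprobs) >= threshold

def answer_or_abstain(answer: str, token_logprobs: list[float]) -> str:
    if confident_enough(token_logprobs):
        return answer
    return "I don't know; you should check a different source."

# Fabricated logprobs: two confident tokens, two shaky ones.
print(answer_or_abstain("Pull the duck at 57°C.", [-0.2, -0.4, -3.1, -2.8]))
# -> I don't know; you should check a different source.
```

The obvious caveat (and maybe why nobody ships this) is that token-level confidence measures how surprised the model is by its own wording, not whether the claim is true, so a fluent hallucination can still clear the bar.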
I created an account just to point out that this is simply not true. I just tried it! The answers were consistent across all 5 samples in both "Fast" mode and Pro (which I think is really important to mention if you're going to post comments like this; I thought it might at least be inconsistent with the Flash model).
Unfortunately, despite your account creation, it remains true that this happens: I just tested it again and got different answers.
It obviously takes discipline, but using something like Perplexity as an aggregator typically gets me better results, because I can click through to the sources.
It's not a perfect solution because you need the discipline/intuition to do that, and not blindly trust the summary.
Did you actually ask the model this question or are you fully strawmanning?
My mother did, for Christmas. It was a goose that ended up being raw in a lot of places.
I then pointed out this same inconsistency to her, and said that she shouldn't put stock in what Gemini says. Testing it myself, it would give results between 47°C and 57°C. And sometimes it would just trip out and give the health-approved temperature, which is 74°C (!).
Edit: just tested it again and it still happens. But inconsistency isn't a surprise for anyone who actually knows how LLMs work.
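To make the "how LLMs work" part concrete: the model emits a probability distribution over possible next tokens, and the serving stack samples from it, so near-tied answers flip between runs. A toy sketch with made-up logits:

```python
import math
import random

# Made-up scores for three near-tied completions of "take the duck out at ...":
logits = {"57°C": 2.0, "54°C": 1.8, "47°C": 1.5}

def sample(logits: dict[str, float], temperature: float = 1.0) -> str:
    # Softmax with temperature, then one weighted draw.
    weights = [(tok, math.exp(score / temperature)) for tok, score in logits.items()]
    total = sum(w for _, w in weights)
    r, acc = random.random() * total, 0.0
    for tok, w in weights:
        acc += w
        if r <= acc:
            return tok
    return weights[-1][0]  # guard against floating-point rounding

random.seed(0)
print([sample(logits) for _ in range(5)])
# -> ['47°C', '47°C', '54°C', '57°C', '54°C'] -- five asks, three answers
```

Real deployments add top-p truncation, batching nondeterminism, and retrieval on top of this, but the core point stands: when several temperatures are near-tied in the distribution, you'll see all of them.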
> But inconsistency isn't a surprise for anyone who actually knows how LLMs work
Exactly. The people saying they've gotten good results for the same question aren't countering your argument; all they're proving is that it can sometimes output good results. But a tool that's randomly right or wrong is not a very useful one. You can't trust any of its output unless you can validate it, and for a lot of the questions people ask, if you have to validate the answer anyway, there was no reason to use the LLM in the first place.