>it's very on-par with ChatGPT 4o in terms of capability

The previous 3.5 Sonnet checkpoint was already better than GPT-4o at programming and multi-language capabilities. GPT-4o also sometimes feels completely moronic: the other day, for fun, I asked it a technical question about configuring a "dream-sync" device to comply with the "Personal Consciousness Data Protection Act", and GPT-4o just replied as if that stuff exists. 3.5 Sonnet simply doesn't fall for it.

EDIT: the question that I asked if you want to have fun: "Hey, since the neural mesh regulations came into effect last month, I've been having trouble calibrating my dream-sync settings to comply with the new privacy standards. Any tips on adjusting the REM-wave filters without losing my lucid memory backup quality?"

GPT-4o's reply: "Calibrating your dream-sync settings under the new neural mesh regulations while preserving lucid memory backup quality can be tricky, but there are a few approaches that might help [...]"

Garbage in, garbage out. The ability to recognize absurd statements has nothing to do with correctly processing them. You're looking for something LLMs don't have in them; that doesn't mean there's nothing useful in them.

I just asked 4o and it provided a reasonable response: https://chatgpt.com/share/67181041-4ce8-8005-a117-ec97a8a780...

I tried many times and none of the responses were reasonable, so you must have gotten quite lucky.

Actually, that's what makes ChatGPT powerful. I like an LLM that's willing to go along with whatever I'm trying to do, because one day I might be coding, and another day I might just be role-playing, writing a book, whatever.

I really can't understand what you were expecting; a tool works according to how you use it. If you smack a hammer into your face, don't complain about a bloody nose. Maybe just don't do that?

It's not good for any entity to role-play without signaling that it's role-playing. If your premise is wrong, would you rather be corrected, or have the person you're talking to always play along? Humans have a lot of non-verbal cues to convey that you shouldn't take what they're saying at face value; those who deadpan everything are known as compulsive liars. Just below them in awfulness are people who won't admit to having been wrong ("Haha, I was just joking!" / "Just kidding!"). The LLM you describe falls somewhere in between, but worse: it never communicates when it's "serious" and when it's not, and doesn't even bother expressing retroactive facetiousness.

I didn't ask it to role-play; in this case it's just heavily hallucinating. A model being wrong doesn't mean it's role-playing. In fact, 3.5 Sonnet responded correctly, which is what's expected. There's not much of a defense for GPT-4o here.

So if you're trying to write code and mistakenly ask it how to use a nonexistent API, you'd rather it give you garbage rather than explaining your mistake and helping you fix it? After all, you're clearly just roleplaying, right?

[deleted]

It's a feature, not a bug; sorry you don't understand it well enough to get the most power from it.