Crazy they bring up honest, when Claude models are literally known for straight up lying about things it has done and tries to act like it did what you asked.
Crazy they bring up honest, when Claude models are literally known for straight up lying about things it has done and tries to act like it did what you asked.
Which is why they brought it up as something they are trying to improve.
Less than other frontier models. Which is scary honestly.
No. GPT models follow instructions significantly better than Claude models.
You tell it too research a repo to find a piece of code it will. Claude will just read the README and guess.
I have a codex session I am using to vibe code a db thats being going for like 3 month. Still doing OK. Try that in CC.
What's the token usage at?