In my experience writing about 50 programs with Fable, Opus, and GPT, Fable is a significant step up from Opus, which is in turn significantly better than GPT. We must be doing different things.
From what I’ve seen all three are close enough that I would be hard pressed to pick one. It seems to matter much more how I prompt than which of the three I am using.
I'm writing low-level Rust, distributed systems, also sandboxing tech which has to be secure and performant.
The only thing I have Fable do now is create UIs or otherwise front-ends for systems where correctness doesn't matter as much.
Anthropic models lead at making nice looking UIs for sure, but when it comes to making sure my Rust code is actually 100% correct and uses 1% of CPU most of the time, Codex is king.
Definitely not in my experience. I usually write distributed systems and back end code, and Fable is so much better at those than Codex that it's not even a comparison. Fable feels like it's a year ahead.
Interesting, I’d love to see the comparisons of your system using Claude vs Codex. I have about 20 years of experience in distributed systems at super high scale at several FAANGs, and have also built AI model serving infra handling roughly 20k transactions per second.
For me, Claude makes boneheaded decisions all the time, like glaring errors, not even particularly subtle.
But the more obvious flag is the amount of irrelevant code and tests which Fable writes. Like it regularly writes 2X or 3X the amount of code and tests that are needed. It’s an expert at writing plausible but entirely useless tests.
But I think that if you’re a more junior engineer or haven’t been around the block, you can easily think that “more code equals smarter”. Claude ends up creating a massive, hard to manage codebase, and if you look at the Claude Code codebase (which was leaked), you can see I’m right!
The Claude Code codebase is terrible. And presumably Anthropic has been using their smartest models for working on Claude Code. I wrote my own coding harness with Codex (as a fun experiment) which uses a fraction of the code and is about 100X more performant and memory efficient than Claude Code!