It's interesting how variable people's experiences seem to be.
Personally, I tend to get crap quality code out of Claude. Very branchy. Very un-DRY. Consistently fails to understand the conventions of my codebase (e.g. keeps hallucinating that my arena allocator zero initializes memory - it does not). And sometimes after a context compaction it goes haywire and starts creating new regressions everywhere. And while you can prompt to fix these things, it can take an entire afternoon of whack-a-mole prompting to fix the fallout of one bad initial run. I've also tried dumping lessons into a project specific skill file, which sometimes helps, but also sometimes hurts - the skill file can turn into a footgun if it gets out of sync with an evolving codebase.
In terms of limits, I usually find myself hitting the rate limit after two or three requests. On bad days, only one. This has made Claude borderline unusable over the past couple weeks, so I've started hand coding again and using Claude as a code search and debugging tool rather than a code generator.
> Very branchy. Very un-DRY.
I've found this can be vastly reduced with AGENTS.md instructions, at least with codex/gpt-5.4.
What sorts of instructions?
Usually I just put something like "Prefer DRY code". I like to keep my AGENTS.md DRY too :)
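For anyone curious what that looks like in practice, here's a sketch of a minimal AGENTS.md with a few rules in that spirit (the specific wording is just an example, not a recommended canonical set):

```markdown
# AGENTS.md

## Code style
- Prefer DRY code: factor out repeated logic instead of copy-pasting branches.
- Keep control flow flat; avoid deeply nested conditionals.

## Codebase conventions
- The arena allocator does NOT zero-initialize memory. Always initialize explicitly.
- Follow existing naming and error-handling patterns in the surrounding file.
```

The codebase-conventions section is the part that tends to matter most, since it encodes facts the model would otherwise guess at, but as noted upthread it can become a footgun if it drifts out of sync with the code.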
Also add "no hallucinations" and "make it work this time, pretty please", and say Claude will go to jail if it doesn't do it right. That should work all the time (so, like, 60%).
There are of course limits to what prompting can do, but it does steer the models.
In TFA they found that prompting mitigates over-editing by up to about 10 percentage points.