My problem with all of the Codex/GPT-based offerings is _still_ that they think for way too long. After using the Claude 4 models through Cursor Max/Ampcode, I feel much more effective given their speed. Ironically, Claude Code feels just as slow as Codex/GPT (even with my company routing it through AWS Bedrock). It only makes me more convinced that the consumer modes have perverse incentives.
I almost never have to reprompt GPT-5-high (now gpt-5-codex-high), whereas I'd be reprompting Claude Code all the time. Claude feels faster and seems to be doing more, but it ends up taking more of the developer's time by getting things wrong.
It’s great for multitasking. I’ve cloned one of the repos I work on into a separate folder and run Codex CLI there. I feed it bug reports that users have submitted while I work on bigger tasks.