I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex, and when I ask it to fix an E2E test, it tells me it fixed it and prints a command I can run to verify the changes, instead of running the test itself and looping until it actually passes. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.

Non-Codex GPT-5.2 is much better than Codex GPT-5.2 for me. It does everything better.

Yup, I find it very counter-intuitive that this would be the case, but I switched today and I can already see a massive difference.

It fits with the intuition that codex is simply overfitted.

Yeah, I meant it more like it is not intuitive to me why OpenAI would fumble it this hard. They have got to have tested it internally and seen that it sucked, especially compared to GPT-5.2.

Codex runs in a stupidly tight sandbox, and because of that it refuses to run anything.

But using the same model through pi, for example, it's super smart because pi just doesn't have ANY safeguards :D

I'll take this as my sign to give Pi a shot then :D Edit: I don't want to speak too soon, but this Pi thing is really growing on me so far… Thank you!

Wait until you figure out you can just say "create a skill to do..." and it'll just do it, write it in the right place and tell you to /reload

Or "create an extension to..." and it'll write the whole-ass extension and install it :D
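For anyone wondering what "write it in the right place" means: a skill is typically just a markdown file the agent drops into its skills directory. A minimal sketch, assuming pi follows the common SKILL.md convention — the directory path, frontmatter fields, and skill name below are all made-up examples, not pi's documented layout:

```
# .pi/skills/fix-e2e-loop/SKILL.md  (hypothetical path)
---
name: fix-e2e-loop
description: Run the E2E suite, read the failures, patch, and re-run until green.
---

1. Run the E2E test command and capture the output.
2. If tests fail, read the failing assertions and patch the code.
3. Re-run; repeat until the suite passes, then summarize the changes.
```

Once a file like that exists, /reload is what makes the agent pick it up, per the comment above.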

I refuse to defend the 5.2-codex models. They are awful.