If you're using 5.2 high, with all due respect, this has to be a skill issue. If you're using 5.2 Codex high, switch to 5.2 high. gpt-5.2 is slow, yes (ok, keeping it real, it's excruciatingly slow), but it's not the moronic caricature you're making it out to be.
If you need it to be up to date with your version of a framework, ask it to use the context7 MCP server. Expecting training data to be up to date is unreasonable for any LLM, and we now have useful solutions to the training-data problem.
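For reference, wiring that up is usually a few lines of client config. A minimal sketch, assuming the npx-based setup from the context7 README (the exact file and shape depend on which client you use):

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}
```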
If you need it to specify the latest version, don't say "latest". That word would be interpreted differently by humans as well.
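Concretely, either pin the version yourself or tell the agent to resolve it with a command instead of guessing from training data. A sketch of the two phrasings (flowbite is just the example from elsewhere in this thread):

```
# ambiguous: "latest" resolves to whatever the training data remembers
"Add the latest version of flowbite."

# unambiguous: make the agent look it up at run time
"Run `npm view flowbite version` and install exactly that version."
```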
Claude is well known for its one-shotting skills. But that comes at the expense of strict instruction adherence and thinner context (it doesn't spend as much time gathering context in larger codebases).
I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex, and when I ask it to fix an E2E test it tells me that it fixed it and prints a command I can run to verify the changes, instead of running the test itself and looping until it passes. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.
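For what it's worth, an explicit standing rule usually cures the "prints a command instead of running it" habit. A sketch of the kind of instruction you might drop in AGENTS.md (the test command here is a hypothetical placeholder, not from my setup):

```
When asked to fix a test:
1. Run the failing test yourself (e.g. `npx playwright test path/to/spec`).
2. If it fails, change the code and re-run.
3. Repeat until it passes; never report success without a passing run.
```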
Non-Codex GPT-5.2 is much better than Codex GPT-5.2 for me. It does everything better.
Yup, I find it very counter-intuitive that this would be the case, but I switched today and I can already see a massive difference.
It fits with the intuition that codex is simply overfitted.
Yeah, I meant it more like: it's not intuitive to me why OpenAI would fumble it this hard. They have got to have tested it internally and seen that it sucked, especially compared to GPT-5.2.
Codex runs in a stupidly tight sandbox and because of that it refuses to run anything.
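To be fair, the sandbox is configurable. A sketch, assuming the `sandbox_mode` and `approval_policy` keys documented for the Codex CLI's `~/.codex/config.toml` (check the docs for your installed version before relying on this):

```toml
# ~/.codex/config.toml
# loosen the default sandbox so the agent can actually run commands
sandbox_mode = "workspace-write"   # or "danger-full-access" if you accept the risk
approval_policy = "on-request"
```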
But using the same model through pi, for example, it's super smart because pi just doesn't have ANY safeguards :D
I'll take this as my sign to give Pi a shot then :D Edit: I don't want to speak too soon, but this Pi thing is really growing on me so far… Thank you!
Wait until you figure out you can just say "create a skill to do..." and it'll just do it, write it in the right place and tell you to /reload
Or "create an extension to..." and it'll write the whole-ass extension and install it :D
i refuse to defend the 5.2-codex models. They are awful.
If he was able to get Claude Code to do what he wanted in less time, and with a better experience, then maybe that's not a skill he (or the rest of us) wants to develop.
Talking LLMs off a ledge is a skill we will all need going forward.
still a skill issue, not a codex issue. sure, this line of critique is also one levied by tech bros who want to transfer your company's balance sheet from salaries to ai-SaaS(-ery), but in what world does that automatically make the tech fraudulent or even deficient? and since when is not wanting to develop a skill a reasonable substitute for anything? if my doctor decided they didn't want to keep up on medical advances, i would find a different doctor.

yet somehow finding fault with an ai because it can't read your mind, refusing to introspect at all about why that might be, and blaming it on the technology is a reasonable critique? somehow we have magically discovered a technology to manufacture cognition from nothing more than the intricate weaving of silicon, dopants, et al., and the takeaway is that it sucks because it is too slow, doesn't get everything exactly right, etc.? and the craziest part is that the more time you spend with it, the better intuition you get for getting whatever it is you want out of it.

but, yeah... let's lend even more of an ear to the head-in-sand crowd-- that's where the real thought leaders are. you don't have to be an ai techno-utopian maximalist to see the profound worthiness and promise of the technology; these things are manifestly self-evident.
Sure, that's fine. I wrote my comment for the people who don't get angry at AI agents after using them for the first time within five hours of their release, and for those who aren't interested in portending doom for OpenAI. (I have elaborate setups for Codex/Claude btw; there's no fanboying in this space.)
Some things aren't common sense yet so I'm trying my part to make them so.
common sense has the misfortune of being less "common" than we would all like it to be. because some breathless hucksters are overpromising and underdelivering in the present, we may as well throw out the baby, the bath water, and the bath tub itself! who even wants computers to think like humans and automate jobs that no human would want to do? don't you appreciate the self-worth that comes from menial labor? i don't even get why we use tractors to farm when we have perfectly good beasts of burden to do the same labor!
Feelings are information, with just as much value as biased intellectualizing, or more.
Ask Linus Torvalds.
i have absolutely no idea whatsoever what this means
TBH, "use a package manager, don't specify versions manually unless necessary, don't edit package files manually" is an instructions that most agents still need to be given explicitly. They love manually editing package.json / cargo.toml / pyproject.toml / what have you, and using whatever version is given in their training data. They still don't have an intuition for which files should be manually written and which files should be generated by a command.
Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.
Ty for the tip on context7 mcp btw
How would a person interpret the latest version of flowbite?
Ok. You do you. I'll stick with the models that understand what latest version of a framework means.