I have been doing the same since GPT-3. I remember a time, probably around 4o when it started to get useful for some things like small React projects but was useless for other things like firestore rules. I think that surface is still jagged, it's just that it's less obviously useless in areas that it's weaker.

When things really broke open for me was when I adopted windsurf with Opus 4, and then again with Opus 4.5. I think the way the IDE manages the context and breaks down tasks helps extend llm usefulness a lot, but I haven't tried cursor and haven't really tried to get good at Claude code.

All that said, I have a lot of experience writing in business contexts and I think when I really try I am a pretty good communicator. I find when I am sloppy with prompts I leave a lot more to chance and more often I don't get what I want, but when I'm clear and precise I get what I want. E.g. if it's using sloppy patterns and making bad architectural choices, I've found that I can avoid that by explaining more about what I want and why I want it, or just being explicit about those decisions.

Also, I'm working on smaller projects with less legacy code.

So in summary, it might be a combination of 1, 2 and the age/complexity of the project you're working on.