Sounds like a skill issue?
Recent image models are advancing rapidly at prompt adherence specifically, and being able to iterate on the same image is propelling them even further. Images 2.0 is the poster child of this "agentic iterative image composition" approach.
Images 2.0 isn't anywhere close to the kind of detail control I'm talking about.
It's the opposite of a skill issue. No image generator is anywhere near the ballpark of pro-level manual Photoshop or Illustrator editing for individual elements in an image.
If you don't understand this, try precisely kerning the text in a generated book cover to handle letter combinations like A and V.
This is one of the big problems with GenAI. You can do new things with it, but it's crude, Dunning-Kruger, good-enough-if-you-don't-ask-for-more creativity.
The pros can see what most people can't, and the flaws and missing features are frustrating and obvious creatively, not just in terms of production values.
I fail to see anything other than a skill issue.
We went from "AI can't generate text that isn't at least 20% typos and it always looks like shit" to "some letter combinations aren't kerned to perfection sometimes and adjusting that with prompts is hard" in a couple of generations.