One of the most valuable things about code generation from LLMs is the ability to edit the output: you have all the pieces and can tweak them after the fact. The same goes for normal generated text. Images, on the other hand, are much harder to modify, and the times when you might want text or other “layers” are specifically where they fall apart, in my experience. You might get exactly the person/place/thing rendered, but the additions to the image aren’t right, and it’s nearly impossible to change just the additions without losing at least some of the rest of the image.
I’ve often thought, “I wish I could describe what I want in Pixelmator and have it create a whole document with multiple layers that I can go back in and tweak as needed.”
Yep! I wrote it already on Discord: this is the first step toward further integrating and making use of humans.
I think the future is something like: start a draft, turn the draft into an image with AI, refine the boring layers, then edit the important layer.