Everything is text-to-X because it's less friction and therefore more fun. It's more a marketing thing.

There are many workflows for using generative AI to adhere to specific functional requirements (the entire ComfyUI ecosystem, which includes tools such as LoRAs/ControlNet/InstantID for persistence) and there are many startups which abstract out generative AI pipelines for specific use cases. Those aren't fun, though.