Wan 2.2 is a video model that people have recently been using for text-to-image, and I think it solves this problem way better than Krea in the base model. -- https://www.reddit.com/r/comfyui/comments/1mf521w/wan_22_tex...
As others have said, you can fine-tune any model with a pretty small dataset of images and captions so that your generations don't look like 'AI' and don't all look the same.
Here's one I made a while back, trained on Sony HDVS HD video demos from the 80s/90s -- https://civitai.com/models/896279/1990s-analog-hd-or-4k-sony...
We've noticed that Wan 2.2 (available on Krea) + Krea 1 refinement yields _beautiful_ results. Check this from our designer, for instance: https://x.com/TitusTeatus/status/1952645026636554446
(Disclaimer: I am the Krea cofounder, and this is based on a small sample of results I've seen.)
> prompts in alt
First pic (blonde woman with eyes closed) has alt text that begins:
> Extreme close-up portrait of a black man’s face with his eyes closed
copypasta mistake or bad prompt adherence? haha.
o/t: your astrophotography LoRA is very cool, I came across it before. thanks for making it!
(for others: https://civitai.com/models/890536/nasa-astrophotography-or-f...)
Thanks!
I don't know, those all still look like AI, as in, too clean.