This is what finetuning has been all about since Stable Diffusion 1.5, and especially SDXL. It's also something StabilityAI's base models excelled at in the open-weights category. (Midjourney has always been the champion, but it's proprietary.)

Sadly, with SAI going effectively bankrupt, things changed: their rushed 3.0 model was broken beyond repair, and the later 3.5 felt unfinished (the API version is remarkably better), with gens full of errors and artifacts even though the good ones looked great. It turned out to be hard to finetune as well.

In the meantime Flux got released, but that model can be fried (as in, one concept trained in) yet not properly finetuned (this Krea Flux is not based on the open-weights Flux). Add to that that as models got bigger, training/finetuning now costs an arm and a leg. So here we are, a year after Flux's release, and a good finetune is celebrated as the next new thing :)

Agreed. From the article:

> Model builders have been mostly focused on correctness, not aesthetics. Researchers have been overly focused on the extra fingers problem.

While that might be true for the foundational models, the author seems to be neglecting the tens of thousands of custom LoRAs used to customize the look of an image.

> Users fight the “AI Look” with heavy prompting and even fine-tuning

IMHO it is significantly easier to fix an aesthetic issue than an adherence issue. You can take a poor-quality image, use ESRGAN upscalers, run img2img with it as a ControlNet input, pass it through a different model, add LoRAs, etc.
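For example, with Hugging Face diffusers a basic aesthetic-fixup pass looks roughly like this. It's a minimal sketch: the LoRA repo id and file names are placeholders, and the strength/guidance values are ballpark numbers you'd tune per image.

```python
# Rough img2img "aesthetic fixup" pass with diffusers (a sketch, not a recipe).
import torch
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Optionally layer a style LoRA on top (placeholder repo id).
pipe.load_lora_weights("your-username/some-style-lora")

init = Image.open("raw_gen.png").convert("RGB").resize((1024, 1024))

# Low strength keeps the composition and mostly re-renders surface aesthetics.
fixed = pipe(
    prompt="natural lighting, film grain, muted colors",
    image=init,
    strength=0.35,
    guidance_scale=6.0,
).images[0]
fixed.save("fixed_gen.png")
```

A ControlNet or upscaler pass slots in the same way: the original image stays the structural anchor while a different model or LoRA re-decides the look.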

I have done some nominal tests with Krea, but mostly around adherence. I'd be curious to know if they've reduced the omnipresent bokeh / shallow depth of field, given that it is Flux-based.

> Model builders have been mostly focused on correctness, not aesthetics. Researchers have been overly focused on the extra fingers problem.

> While that might be true for the foundational models

It's possibly true [0] of the models from the big public general-AI vendors (OpenAI, Google); it's definitely not true of MJ. If MJ has an aesthetic bias toward what the article describes as “the AI look”, that's largely because it was a popular, actively sought and prompted-for look in early AI image gen (a way to avoid the flatness bias of early models), and MJ leaned very hard into biasing toward what was popular aesthetically, in that and other areas, as it developed. Heck, lots of SD finetunes actively sought to reproduce MJ aesthetics for a while.

[0] But I doubt it, and I think they have also been actively targeting aesthetics as well as correctness. The post even hints at part of how that reinforced the “AI look”: the focus on aesthetics meant more reliance on the LAION-Aesthetics dataset to tune the models' understanding of what looked good, transferring that dataset's biases into models that were trying to focus on aesthetics.

Definitely. It's been a while since I used Midjourney, but I imagine that style (and sheer speed) are probably its last remaining use cases today.

It is not just a fine-tune.