Technical report here for those curious: https://cdn.sanity.io/files/gsvmb6gz/production/880b07220899...

Seems the implementation is straightforward (very similar to everyone else's: HiDream-E1, ICEdit, DreamO, etc.); the magic is in the data curation, the details of which are only lightly shared.

I haven't been following image generation models closely. At a high level, is this new Flux model still diffusion-based, or have they moved to block autoregressive (possibly with diffusion for upscaling), similar to 4o?

Well, it's a "generative flow matching model".

That's not the same as a diffusion model.

Here is a post about the difference that seems right at first glance: https://diffusionflow.github.io/
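For what it's worth, the training objective is one concrete way to see the distinction. Below is a minimal, illustrative PyTorch sketch of a rectified flow-matching training step, assuming a velocity-prediction network with a `model(x_t, t)` signature (the names and signature are placeholders for illustration, not from the FLUX codebase):

```python
import torch

def flow_matching_loss(model, x1):
    """x1: a batch of clean images (B, C, H, W); x0: Gaussian noise."""
    x0 = torch.randn_like(x1)
    # Sample a random time t in [0, 1] per example.
    t = torch.rand(x1.shape[0], device=x1.device).view(-1, 1, 1, 1)
    # Straight-line interpolation between noise and data.
    x_t = (1.0 - t) * x0 + t * x1
    # The regression target is the constant velocity along that straight path.
    target_velocity = x1 - x0
    pred_velocity = model(x_t, t.flatten())  # hypothetical model signature
    return torch.mean((pred_velocity - target_velocity) ** 2)
```

A DDPM-style diffusion model instead regresses the noise added by a curved, variance-scheduled forward process; the linked post argues the two views are largely equivalent up to reparameterization.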


Diffusion-based. There is no point in moving to autoregressive if you are not also training a multimodal LLM, which these companies are not doing.

Unfortunately, nobody wants to read the report; what they are really after is downloading the open-weight model.

So they can take it and run with it. (No contributing back either).

"FLUX.1 Kontext [dev]

Open-weights, distilled variant of Kontext, our most advanced generative image editing model. Coming soon" is what they say on https://bfl.ai/models/flux-kontext

Distilled is a real downer, but I guess those AI startup CEOs still gotta eat.

The open community has done a lot with the open-weight distilled models from Black Forest Labs already, one of the more radical efforts being Chroma: https://huggingface.co/lodestones/Chroma

I don't doubt that people can do nice things with them. But imagine what they could do with the actual model.

I agree that the gooning crew drives a lot of open-model downloads.

On HN, generally, people are more into technical discussion and/or productizing this stuff. Here it seems déclassé to mention the gooner angle; it's usually euphemized as intense reactions about refusing to download the model, involving the word "censor".
