In my quick image-to-image experiments, this feels even better than GPT-4o. 4o tends to weight colors heavily toward sepia, to the point where it's an obvious tell that an image was 4o-generated (especially after repeated edits); FLUX.1 Kontext seems to use a much wider, more colorful palette. And FLUX, at least the Max version I'm playing around with on Replicate, nails small details that 4o can miss.

I haven't played around with from-scratch generation, so I'm not sure which is better if you're trying to generate an image from just a prompt. But for image-to-image via a prompt, FLUX feels noticeably better.