I recently tried out LMStudio on Linux for local models. So easy to use!

What Linux tools are you using for image generation models like Qwen's diffusion models, since LMStudio only supports text gen?

Practically everybody actually creating with this class of models (mostly diffusion-based) is using ComfyUI. The community takes care of quantization, repackaging into GGUF (the most popular format), and even speed optimization (lightning LoRAs, layer skipping). It's quite extensive.

I personally find that nothing about ComfyUI lives up to its name. Node-based workflows are unruly, and you have to know in advance what you need to do for anything to work. Just poking around and figuring stuff out is nearly impossible, even for technically literate but AI-inexperienced folks.

You could argue that's what pre-made workflows are for, but that doesn't work well for users off the blessed path, i.e. without the Nvidia hardware everyone assumes. I personally find using stable-diffusion.cpp on the command line considerably easier to figure out. Last I saw, it even ships a usable demo web UI for those who really want one (my workflow benefits from heavier scripting, so point-and-click is far too slow and clunky).
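For the scripting-heavy workflow described above, the sd.cpp CLI is easy to drive from Python. Here is a minimal sketch that just builds the argv list for one generation call (the model filename is made up; the flags match current stable-diffusion.cpp builds, but check `sd --help` for yours):

```python
# sketch: scripting stable-diffusion.cpp's `sd` binary (flags per sd --help)
import subprocess

def sd_args(model, prompt, out="output.png", steps=20, width=512, height=512):
    """Build the argv list for one sd.cpp text-to-image call."""
    return [
        "sd",
        "-m", model,        # GGUF or safetensors model file
        "-p", prompt,       # text prompt
        "-o", out,          # output image path
        "--steps", str(steps),
        "-W", str(width),
        "-H", str(height),
    ]

args = sd_args("qwen-image-Q4_0.gguf", "a lighthouse at dusk")
# subprocess.run(args, check=True)  # uncomment once you have a model on disk
```

From there it's trivial to loop over prompts, seeds, or resolutions, which is exactly where the CLI beats a point-and-click UI.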

Everything keeps changing so quickly that I basically run my own Python HTTP server with a unified JSON interface, which routes requests to the impls/*.py files that do the actual generation, roughly one file per implementation/architecture. Mostly I use `diffusers` for inference, which isn't the fastest, but it tends to get new model architectures much sooner than everyone else.
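The routing idea above can be sketched in a few lines. This is illustrative only (the names and request shape are mine, not the commenter's actual code); the lambdas stand in for the real diffusers-backed `impls/*.py` modules:

```python
# sketch: one unified JSON request shape, dispatched per model architecture
import json

# each impls/*.py module would register its generate() callable here;
# stubs stand in for the real diffusers-backed implementations
IMPLS = {
    "sdxl": lambda req: f"sdxl rendered: {req['prompt']}",
    "qwen-image": lambda req: f"qwen-image rendered: {req['prompt']}",
}

def handle(raw_request: str) -> str:
    """Route a unified JSON request to the right implementation."""
    req = json.loads(raw_request)
    impl = IMPLS.get(req.get("arch"))
    if impl is None:
        return json.dumps({"error": f"unknown arch {req.get('arch')!r}"})
    return json.dumps({"image": impl(req)})

print(handle('{"arch": "sdxl", "prompt": "a lighthouse at dusk"}'))
```

The nice part of this shape is that swapping in a new architecture is just adding one entry to the table, without touching the HTTP layer.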

ComfyUI is the best for Stable Diffusion.

FWIW, you can use non-SD models in ComfyUI too. The ecosystem is huge and supports most of the "mainstream" models, not just the Stable Diffusion ones, including video models and more.

Stable-diffusion.cpp is where it's at if you don't care for complex installations and node-based workflows.

I have my own MIT licensed framework/UI: https://github.com/runvnc/mindroot. With Nano Banana via runvnc/googleimageedit

I encourage everyone to at least try ComfyUI. It's come a long way in terms of user-friendliness particularly with all of the built-in Templates you can use.

If you're on an AMD platform Lemonade (https://lemonade-server.ai/) added image generation in version 9.2 (https://github.com/lemonade-sdk/lemonade/releases/tag/v9.2.0).

Ollama is working on adding image generation, but it's not here yet. We really do need something that can run a variety of models for images.

Yeah, I'm guessing they were bound to leave behind their initial "Get up and running with large language models" mission sooner or later, since after 2-3 years investors start pushing you to think about expansion and earning the money back.

Sad state of affairs, and it seems they're enshittifying quicker than expected, but it was always a question of when, not if.

Stability Matrix. It's a manager for models, UIs, LoRAs, etc. Very nice.

LMStudio is a low-barrier entry point for LLMs, for sure. The lowest. Good software!

Other people gave you the right answer, ComfyUI. I’ll give you the more important why and how…

People put a huge effort into doing everything but Comfy because of its intimidating barrier to entry. It's not that bad. Learn it once and be done; you won't have to keep learning the UI of the week endlessly.

The how: go to Civitai, find an image you like, and drag and drop it into Comfy. If it has a workflow attached, Comfy will load it. Install any missing nodes they used, click the loaders to point to your models instead of theirs, hit run, and get the same or a similar image. You don't need to know what any of the nodes do yet.

If for some reason that just does not work for you: SwarmUI is a front end to Comfy. You can change things and it will show you, on the Comfy side, what they're doing. It's a gateway drug to learning Comfy.

EDIT: the most important thing no one will tell you outright: DO NOT FOR ANY REASON try to skip the venv or Miniconda virtual environment when using Comfy! You must make a new, clean setup. You will never get the right Python, torch, diffusers, and driver combination on your system install.
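The clean-setup advice above boils down to a few commands. A minimal sketch (directory names are examples, and the requirements path assumes a standard ComfyUI checkout):

```shell
# give ComfyUI its own isolated Python environment
python3 -m venv comfy-venv            # fresh venv, untouched by system packages
. comfy-venv/bin/activate             # activate it for this shell
# inside the venv, install ComfyUI's pinned dependencies, e.g.:
#   pip install -r ComfyUI/requirements.txt
```

Conda works the same way (`conda create -n comfy python=3.11`); the point is that torch, diffusers, and friends never touch your system Python.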

Koboldcpp has built-in support for image models: model search and download, one executable to run, a UI, an OpenAI API endpoint, a llama.cpp endpoint, and it's highly configurable. If you want to get up and running instantly, just pick a kcppt file and open it; it will download everything you need and load it for you.

Engine:

* https://github.com/LostRuins/koboldcpp/releases/latest/

Kcppt files:

* https://huggingface.co/koboldcpp/kcppt/tree/main