Yeah, I think you need at least 8 GB of RAM, unfortunately, but I only tested it on a 32 GB M2, so even 8 GB might not be enough.

I might create a compressed version of the model that would work on low-RAM machines.

I've worked around lower-RAM machines with ONNX web models by first separating the .onnx from the .onnx_data, and second having scripts that split up the "layers" and shard the run (e.g. https://huggingface.co/cretz/Z-Image-Turbo-ONNX-sharded). Then you can have the runtime only run one shard at a time. I don't understand the details too deeply, but Claude is good at writing scripts to shard ONNX protos.

It froze up my computer, had to hard-reboot lol

(16 GB M1 MacBook, Chrome)