I mind, a lot. That is why I've built a cheap (in relative terms) rig that can run models up to approximately 600B parameters, though inference slows to a crawl once the model spills out of GPU memory. I would much rather be able to run open LLMs slowly than not at all.
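For a sense of why a model that size has to spill out of the GPUs, here is a rough back-of-envelope sketch. The numbers (4-bit quantization, 24 GB cards) are my assumptions, not from the comment above, and it ignores KV cache and activation memory:

```python
def model_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (weights only, no KV cache)."""
    return params * bits_per_weight / 8 / 1e9

# A ~600B-parameter model quantized to 4 bits per weight:
weights = model_size_gb(600e9, 4)
print(f"~600B model at 4-bit: {weights:.0f} GB of weights")

# Hypothetical 24 GB consumer GPUs: how many to hold it all?
print(f"24 GB GPUs needed to fit it: {weights / 24:.1f}")
```

Any layers that don't fit on the cards end up in system RAM, where each token is bottlenecked by CPU memory bandwidth rather than GPU bandwidth, hence the "extremely slowly".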

What hardware? I have been considering doing the same.