> 3.6 pushed VRAM usage just out of 24GB and then you're not using a consumer GPU any more
BTW, you can buy an AMD RX 9700 with 32GB VRAM for $1200. Get two of them and you have quite a powerful local setup. I can run Qwen 3.6 35B at around 80 tok/s at 50% GPU load (300W) and still have plenty of VRAM and power budget left over to run a smaller model for summarization in parallel.
Highly recommend if you want to play with something that doesn't involve NVIDIA and/or unobtanium-class hardware.