This older HN thread shows R1 running on a ~$2k box using ~512 GB of system RAM, no GPU, at ~3.5-4.25 TPS: https://news.ycombinator.com/item?id=42897205
If you scale that setup and add a couple of used RTX 3090s with heavy memory offloading, you can technically run something in the K2 class.
Is 4 TPS actually useful for anything?
That's around 350,000 tokens in a day. I don't track my Claude/Codex usage, but Kilocode with the free Grok model does and I'm using between 3.3M and 50M tokens in a day (plus additional usage in Claude + Codex + Mistral Vibe + Amp Coder)
I'm trying to imagine a use case where I'd want this. Maybe running some small coding task overnight? But it just doesn't seem very useful.