Hacker News

Yep - I'd say either that or 4x 5090s is a great entry point to running local models "well". Two of them would be even better. If you don't have $12-24k to spend, you can try your hand at tiny models, quants, or slow speeds, but it will be a much more painful experience. You're already giving up a lot by dropping down from frontier models - you're giving up even more by trying to squeeze them into limited RAM and compute.
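For a rough sense of what "squeezing into RAM" means, here's a back-of-envelope sketch. It only counts the weights (parameters times bits per weight), then pads that by an overhead fraction for KV cache and activations. The 20% overhead, the 70B example size, and the function names are my own assumptions for illustration, not measurements; the 128 GB figure is just 4 cards at 32 GB each.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_weight: float,
         vram_gb: float, overhead_fraction: float = 0.2) -> bool:
    """Rough check: do weights plus an assumed overhead fit in vram_gb?

    overhead_fraction is a guess covering KV cache, activations, etc.
    Real overhead depends on context length and the runtime used.
    """
    needed = weight_memory_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    total_vram_gb = 4 * 32  # 4x 5090 at 32 GB each
    for bits in (16, 8, 4):  # full precision vs. two common quant levels
        gb = weight_memory_gb(70, bits)  # hypothetical 70B-parameter model
        print(f"70B @ {bits}-bit: ~{gb:.0f} GB weights, "
              f"fits in {total_vram_gb} GB: {fits(70, bits, total_vram_gb)}")
```

Treat this as a lower bound only. It says nothing about speed or about how much quality you lose at lower bit widths, which is the other half of the tradeoff.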

Prices will fall in the next few years. Maybe just play with the tiny toy models for now to learn how they work, then keep using API providers until they do.