Hacker News

> Useful models are getting smaller and cheaper to run every year and it has hit a threshold at which we will see continued development of third party harnesses even without the userbase of subscription users.

As of May 2026, how much do I need to spend on hardware to run a local model that is 80% as good as SOTA services at helping me write code?

And at that 80%, how many minutes per LOC will I spend waiting, and how many attempts per query will I waste before it comes up with something sensible?