Hacker News

These are models that can be run locally, BTW. Just get enough hardware for your throughput requirements, keep it grinding through batched requests 24/7 to get reasonable utilization (keeping the cloud for time-sensitive uses), and that's it: no more rug pulls.
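
For a rough idea of what "enough hardware" means, here is a back-of-the-envelope sketch. The function name, the per-node throughput, and the utilization figure are all hypothetical placeholders, not measurements; plug in numbers from your own benchmarks.

```python
import math

SECONDS_PER_DAY = 86_400


def nodes_needed(tokens_per_day: float,
                 tokens_per_sec_per_node: float,
                 utilization: float = 0.7) -> int:
    """Estimate how many nodes clear a daily batch workload when run 24/7.

    All inputs are assumptions supplied by the caller:
      tokens_per_day          -- total tokens the batch jobs must process daily
      tokens_per_sec_per_node -- measured batched throughput of one node
      utilization             -- fraction of the day the node does useful work
    """
    if tokens_per_sec_per_node <= 0:
        raise ValueError("tokens_per_sec_per_node must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Tokens one node can realistically process in a day.
    effective_per_day = tokens_per_sec_per_node * utilization * SECONDS_PER_DAY
    # Round up, and never report fewer than one node.
    return max(1, math.ceil(tokens_per_day / effective_per_day))


if __name__ == "__main__":
    # Hypothetical workload: 500M tokens/day at 2,000 tokens/s per node.
    print(nodes_needed(500e6, 2_000, utilization=0.7))
```

This only covers the batch side that can wait in a queue; anything latency-sensitive would still go to the cloud, as above.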