Hacker News

thunderbird120 8 hours ago

That's what it's running on. It's optimized for very high throughput on Cerebras' hardware, which is uniquely capable of running LLMs at very high speeds.