Hacker News

jbentley1 3 days ago

From a conversation with someone at Groq: they have a custom compiler and runtime for running models on their custom hardware, which is why the model selection is poor. Each model architecture has to be ported to their compiler before it can be served.

boroboro4 3 days ago

They can't host DeepSeek because it's too big. Their chips have 230 MB of memory each, so it would take ~3000 chips to hold the model, plus a (possibly large) number of chips for the KV cache. I bet it's just too hard to bring a topology like that online at all, and impossible to make it anywhere near profitable.
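
For anyone who wants to check the ~3000 figure, here is a rough back-of-the-envelope sketch. The 230 MB per chip comes from the comment; the parameter count (671 billion) and the 1 byte per parameter (FP8 weights) are my assumptions, since the comment doesn't say which DeepSeek model or precision it has in mind. It counts weights only and ignores KV cache, activations, and any per-chip overhead.

```python
def chips_needed(model_bytes: int, chip_bytes: int) -> int:
    """Ceiling division: number of chips required to hold the weights alone."""
    return -(-model_bytes // chip_bytes)


# Assumptions (not stated in the thread): 671e9 parameters, 1 byte each (FP8).
PARAMS = 671_000_000_000
BYTES_PER_PARAM = 1

# From the comment: 230 MB of on-chip memory per chip.
CHIP_MEMORY_BYTES = 230 * 10**6

weights_only = chips_needed(PARAMS * BYTES_PER_PARAM, CHIP_MEMORY_BYTES)
print(f"chips for weights only: {weights_only}")
```

Under those assumptions the result lands just under 3000, which matches the estimate. Storing weights at 2 bytes per parameter would roughly double it, before any chips are set aside for KV cache.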