Hacker News

Yes, I think they claim their chip is a far better dice roller, in both randomness and speed, and that this is very important. The first claim might be true, but I don’t see why the second is in any way true. All of these need to be true for this company to make sense:

1. They build a chip that does random sampling far better than any GPU (is this even proven yet?)

2. They use a model architecture that exploits this sampling advantage, which means most of the computation must be concentrated in sampling. This might be true for energy-based models or some future architecture we have no idea about. AFAIK, this is not even true for diffusion.

3. This model architecture must outcompete autoregressive models in economically useful tasks, whether language modeling, robotics, etc. Right now autoregressive transformers are still king across all tasks.

And then their chip has to be bought by hyperscalers for the company to become successful. There are just so many ifs beyond building their core technology that this whole project makes no sense. You can say this is true for all startups, but I don’t think that’s the case; this is just ridiculous.
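
To put rough numbers on point 2, here is a back-of-the-envelope Amdahl's law sketch. The runtime fractions and the 1000x sampling speedup are made-up illustrative values, not measurements of any real chip or model:

```python
def overall_speedup(sampling_fraction: float, sampling_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the sampling step gets faster."""
    return 1.0 / ((1.0 - sampling_fraction) + sampling_fraction / sampling_speedup)


# Hypothetical numbers: assume the chip samples 1000x faster than a GPU,
# and vary how much of total runtime is actually spent sampling.
for fraction in (0.01, 0.10, 0.50, 0.90):
    gain = overall_speedup(fraction, 1000.0)
    print(f"sampling = {fraction:.0%} of runtime -> {gain:.2f}x overall")
```

If sampling is only a small slice of the total work, even an arbitrarily fast sampler barely moves the end-to-end number, which is why the architecture has to be sampling-dominated for the chip to matter.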