Hacker News

Yeah, they're clearly just starting out and just shipped their very first proof of concept. But to me, their plans seem generally reasonable https://taalas.com/the-path-to-ubiquitous-ai/, and like I wrote, if this kind of thing succeeds and could become some kind of cheaply producible commodity component, I think there's huge value in that. Maybe not as a frontier model replacement, but say 10 years from now you could drop a cheap Raspberry Pi-like device into your LAN and have a fast local engine for things like text sentiment analysis, text summarisation, voice recognition and basic vision. That would be pretty exciting to me (but maybe, as you outlined, impossible in practice).

There is a reasonable kernel of an idea here, but only if you dial expectations WAY back. The 10-year speculation is just wrong though. Even in 10 years, their 8B param model isn't going to be in consumer devices.

6nm is just 7nm++ and the process will be a decade old in a few months. In the decade since, we've only had a slightly less than 3x increase in transistor density and that's including EUV, BSPD, and GAAFET (which means progress is likely going to slow down even more).

Even if we hit another 3x increase on top of that (about 9x total versus 6nm), their 815mm2 design will still be a bit over 90mm2. For comparison, the entire M5 Pro/Max CPU die is just 61.7mm2.

If our current progress somehow holds (not likely), even 20 years from now the 8B model would still be about 30mm2. You need 30 years of dead consistent progress to get it down to 10mm2, which is small enough to include in a consumer chip.
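To make the arithmetic explicit, here is a rough sketch of that projection. It assumes a flat 3x density gain per decade (rounding up the "slightly less than 3x" figure) and that die area shrinks in direct proportion to density; the function name and the step labels are just for illustration.

```python
# Back-of-envelope die-area projection.
# Assumptions: 815 mm^2 for the 8B design on 6nm, and exactly 3x density
# gain per step (the comment says "slightly less than 3x", so this is optimistic).
BASE_AREA_MM2 = 815.0
DENSITY_GAIN_PER_STEP = 3.0


def projected_area_mm2(steps: int) -> float:
    """Die area after `steps` successive 3x density gains relative to 6nm."""
    return BASE_AREA_MM2 / DENSITY_GAIN_PER_STEP ** steps


# Step 1 is roughly today's leading node (a decade after 6nm-class processes);
# steps 2 to 4 are 10, 20 and 30 years from now.
for steps, label in [(1, "today"), (2, "+10 years"), (3, "+20 years"), (4, "+30 years")]:
    print(f"{label:>10}: {projected_area_mm2(steps):6.1f} mm^2")
```

That works out to roughly 272, 91, 30 and 10 mm2, which is where the figures above come from.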

As you can see, this doesn't make sense to invest in. As for things like voice recognition or basic vision, these can often fit within 100M parameter models, which would be around 10mm2 on their current 6nm design. That's doable today in custom edge computing devices.

The other possible use is cheap fallback models for AI companies. Moving to N2 and shrinking chips to 600mm2 to improve yields a bit would give about 50B parameters across 3 chips, plus another FPGA-ish programmable chip for continued training and the interconnects for everything. You'd need hundreds of thousands of chips produced for that exact AI model just to get costs below $100,000 per board.
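The parameter counts here are the same linear scaling run in the other direction. A minimal sketch, assuming parameters scale linearly with area and density, and that N2 is about 3x denser than 6nm (both are simplifications; the helper names are made up for this example):

```python
# Rough parameters-per-area scaling from the 8B / 815 mm^2 / 6nm data point.
# Assumptions: capacity scales linearly with area and density, and N2 is
# treated as 3x denser than 6nm.
BASE_PARAMS_B = 8.0      # billions of parameters in the current design
BASE_AREA_MM2 = 815.0    # die area of the current 6nm design
N2_DENSITY_GAIN = 3.0    # assumed density gain of N2 over 6nm


def params_billion(area_mm2: float, density_gain: float = 1.0) -> float:
    """Parameters (in billions) that fit in `area_mm2` at a given density gain."""
    return BASE_PARAMS_B / BASE_AREA_MM2 * area_mm2 * density_gain


def area_mm2_for(params_b: float, density_gain: float = 1.0) -> float:
    """Die area needed for `params_b` billion parameters."""
    return params_b * BASE_AREA_MM2 / (BASE_PARAMS_B * density_gain)


per_chip = params_billion(600, N2_DENSITY_GAIN)   # one 600 mm^2 N2 chip
board = 3 * per_chip                              # three chips per board
small_model = area_mm2_for(0.1)                   # 100M params on 6nm

print(f"per chip: {per_chip:.1f}B, per board: {board:.1f}B")
print(f"100M-parameter model on 6nm: {small_model:.1f} mm^2")
```

That gives a bit under 18B per chip and about 53B per board, plus roughly 10mm2 for a 100M-parameter model on today's 6nm design.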

That seems like a lot of money for the AI model you are essentially giving away, but maybe it still beats the power and price of GPU server racks.