Hacker News

BatteryMountain 11 hours ago

So what if we build a stack/set of transistors in the same shape as a trained model? It would eliminate most of the software stack too and should run very fast. No memory/GPU required: the chip acts as both storage and processing device, purpose-built to be a physical model of a trained model.

tomtom1337 9 hours ago

This is literally what Taalas has done with chatjimmy.ai.

Try it, it's Llama 3.1 8B at 16,000 tokens per second.

chatjimmy.ai https://taalas.com/the-path-to-ubiquitous-ai/

jupr 5 hours ago

Wow, that's incredibly fast. I like this outcome more than centralized datacenters.

mr_toad 5 hours ago

But it can only run that model, so it will be outdated in a few years at best.

rusk 8 hours ago

There are lots of things you can do in hardware that could be done in software, but at a cost. FPGAs should have solved this long ago, but apparently the guys who own the IP want to make them as hard as possible to use …