Hacker News

NotSuspicious 15 hours ago

The interesting thing about models this small is that they should fit on a single Taalas chip (the HC1 already runs a Llama 3.1 8B model). We're already at the point where half-decent reasoning could be run on an ASIC (and at mind-boggling speeds).

pants2 14 hours ago

Yeah, if they can fit an 8B model that's really good at improving the output by thinking, running at 16K tok/s on Taalas would be mind-blowing.

le-mark 10 hours ago

Given this and the quality of open models, it makes no sense to me that there’s a future for Anthropic et al.

james_marks 8 hours ago

Packaging a capability into a consumable form will still be a business.

It's like web hosting: all the open source tools are there and free, and yet website tools, hosts, etc. flourish.

WhiteDawn 7 hours ago

It’s true, but hosting prices are still within spitting distance of rolling it yourself.

SOTA providers are expecting some level of margin. Companies everywhere are keeping a close eye on their AI bills right now.

The motivation to run open models yourself is there if they get good enough, even if it’s more painful.