Yeah, if they can fit an 8B model that's really good at improving the output by thinking, running at 16K tok/s on Taalas would be mind-blowing.
Given this and the quality of open models, it makes no sense to me that there’s a future for Anthropic et al.
Packaging a capability into a consumable form will still be a business.
It's like web hosting: all the open source tools are there and free, and yet website tools, hosts, etc. flourish.
It’s true, but hosting prices are still within spitting distance of rolling it yourself.
SOTA providers are expecting some level of margin. Companies everywhere are keeping a close eye on their AI bills right now.
The motivation is there if the models get good enough, even if it’s more painful.