Hacker News

ilaksh 7 hours ago

Could be amazing, but it's hard to judge if it will really work with, say, a 27B model or larger. We can already get pretty good speed with a 2B model.

gaeld 6 hours ago

thanks! we explain how it scales to larger models in the last section of the OP blog post

bcjdjsndon 3 hours ago

Shame you stopped short of actually benchmarking that scale though, eh?

gaeld 2 hours ago

will do - we are a small team and it takes time to implement and optimize a new model, whatever the size.