I've done a fair amount of fine-tuning for conversational voice use cases. Smaller models can do a really good job on a few things: routing to bigger models, constrained scenarios (think ordering food items from a specific and known menu), and focused tool use.
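To make the routing idea concrete, here's a minimal sketch of the kind of setup I mean. The `call_small_model` and `call_big_model` functions are hypothetical stand-ins for whatever inference endpoints you actually use; the point is just the shape of the routing logic.

```python
# Sketch of the small-model-as-router pattern: the small model handles
# constrained turns itself and escalates open-ended ones to a bigger model.

def call_small_model(prompt: str) -> str:
    """Placeholder for a local, fine-tuned small model."""
    raise NotImplementedError

def call_big_model(prompt: str) -> str:
    """Placeholder for a larger hosted model."""
    raise NotImplementedError

def handle_turn(user_utterance: str) -> str:
    # Ask the small model to classify the turn into a narrow set of intents.
    intent = call_small_model(
        f"Classify as ORDER, TOOL_USE, or OPEN_ENDED: {user_utterance}"
    ).strip()

    if intent in {"ORDER", "TOOL_USE"}:
        # Constrained scenarios stay on the small model: the known menu
        # and tool schemas bound what a good answer looks like.
        return call_small_model(user_utterance)

    # Anything open-ended gets escalated to the bigger model.
    return call_big_model(user_utterance)
```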

But small and medium-sized models haven't hit that sweet spot between open-ended conversation and reasonably on-the-rails responsiveness to what the user has just said. We don't yet know how to build models under 100B parameters that do that. It seems pretty clear that we'll get there, given the pace of improvement. But we're not there yet.

Now maybe you could argue that a kid is going to be happy with a model that you train to be relatively limited and predictable. And given that kids will talk for hours to a stuffie that doesn't talk back at all, on some level this is a fair point! But you can also argue the other side: kids are the very best open-ended conversationalists in the world. They'll take a conversation anywhere! So giving them an 8B-parameter, 4-bit-quantized Santa would be a shame.