Hacker News

justinhj 6 hours ago

We see the same with Google's Flash models. It's easier to make a small capable model when you have a large model to start from.

karmasimida 6 hours ago

Flash models are nowhere near Pro models in daily use. They hallucinate much more, and they easily get into a death spiral of failed tool calls and never come out.

You should always take claims that smaller models are as capable as larger models with a grain of salt.

justinhj 4 hours ago

Flash model n is generally a slightly better Pro model (n-1). In other words, you get to use the previously premium model as a cheaper/faster version. That has value.

karmasimida 3 hours ago

They do have value, because they are much, much cheaper.

But no, 3.0 Flash is not as good as 2.5 Pro. I use both of them extensively, especially for translation. 3.0 Flash will confidently mistranslate certain things, while 2.5 Pro will not.

justinhj 10 minutes ago

Totally fair. Translation is one of those specific domains where model size correlates directly with quality, and no amount of architectural efficiency can fully replace parameter count.