This is obviously trained on Pro 3 outputs for benchmaxxing.
Not trained on pro, distilled from it.
What do you think distilled means...?
It's good to keep the language clear, because you could pretrain/sft on outputs (as many labs do), which is not the same thing.
> for benchmaxxing.
Out of all the big 4 labs, Google is the last I'd suspect of benchmaxxing. In my experience their models have generally underbenched and overdelivered on real-world tasks, ever since 2.5 Pro came out.