Hacker News

That's something for us and benchmarks to decide.

However, it definitely isn't _just_ Kimi. The weights will be different after that 85% of extra training on top of the base model.

Whether those different weights are better or worse doesn't change that it's, in most meaningful ways, not the same as the base one.

I would encourage you to look up their blog posts about their post-training process if you want a bit more faith that they aren't running an extra 85% of compute and burning money on no-ops.

airstrike 2 hours ago

"Just Kimi" is hyperbole, to be clear.

I don't think it's all no-ops. Still don't think it's a particularly relevant model/company/product.

I'll defer the reading until I see signal that they have something worthwhile. I've watched a couple interviews and used the product, neither of which impressed me.