> I think the Chinese government either has already grasped, or soon will, that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.
That doesn't require the model to be SOTA; it can just be a compact model capable of running on inexpensive hardware. That is vastly different from SOTA models like Mythos, which can potentially disrupt lots of things.
Of course it requires SOTA. People will always choose better models over some compact thing that is obviously more limited. You can't control the truth with models nobody wants to use.
People choose SOTA right now because of the heavily subsidised model subscriptions. People aren't going to pay 20x the price for a model that's maybe 10% better.
And there's the fact that "better" is highly subjective and domain/task/vibe-specific.
Why do I want the model I use for coding to know Shakespeare or vice versa?
Because you communicate with it using natural language, real-world references, and descriptions of what you want. You use emotion and emphasis (especially when re-prompting), and you use examples, illustrative stories, and common expressions. Understanding and interpreting all of that, and replying in kind, requires to some degree a large body of non-computational, cultural knowledge. Otherwise the prompts are just meaningless words, and the replies will look like compiler output.
That sounds intuitively true, but I'm not convinced that it is actually the case. I don't think we know enough about neural network training to say what training and how many parameters are necessary for what kind of performance on which tasks. To me it looks like we currently guess that more is better and throw as much compute and data at the problem as is economically feasible. There is little incentive for companies to invest in small-model research, since their moat is huge models that require special hardware to run.
This is why: https://www.emergent-misalignment.com/
Small models are the future.