They might optimize learning to weight novel/unexpected parts more in the future. The better the models become (the more they expect), the more value they will get from unexpected/new ideas.
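
If you wanted to sketch that idea in code, one (purely hypothetical) way is to upweight tokens the model finds surprising during training. This is just a minimal PyTorch-style sketch of surprisal-weighted loss, not how any current model is actually trained; the function name and `alpha` knob are made up for illustration.

```python
# Minimal sketch (PyTorch), assuming per-token logits and targets from an
# ordinary language-model training step. Tokens the model finds surprising
# (high per-token loss) get a larger weight, so "unexpected" data
# contributes more to the update.
import torch
import torch.nn.functional as F

def surprisal_weighted_loss(logits, targets, alpha=1.0):
    # logits: (batch, seq_len, vocab); targets: (batch, seq_len)
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    )                                            # per-token surprisal in nats
    weights = 1.0 + alpha * per_token.detach()   # upweight surprising tokens
    return (weights * per_token).mean()
```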

Good point. But can the models even behave that way? They depend on probability. If they put greater weight on novel/unexpected outputs, don't they just become undependable hallucination machines? Despite what some people think, these models can't reason about a concept to determine its validity. They depend on recurring data in training to determine what might be true.

That said, it would be interesting to see a model tuned that way. It could be marketed as a 'creativity model', where the user understands there will be a lot of junk and hallucination and that it's up to them to reason out whether a concept has validity or not.

Temperature plays a large role in tuning model output; you're correct that there is a theoretical sweet spot:

https://towardsdatascience.com/a-comprehensive-guide-to-llm-...
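
For anyone who hasn't played with it: temperature just rescales the logits before sampling, so the same model gets peakier (more predictable) at low temperature and flatter (riskier, more "novel", more prone to junk) at high temperature. A tiny NumPy sketch, with made-up logit values just for illustration:

```python
# Minimal sketch (NumPy) of temperature scaling at sampling time.
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()               # numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [4.0, 2.0, 1.0, 0.5]            # hypothetical next-token scores
for t in (0.2, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# Low t concentrates probability on the top token; high t spreads it out.
```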

I think it's happening already. ChatGPT was able to connect my name to my project based on my chess.com profile and one Hacker News post, for example. It's not that hard to imagine that it learns a solution to a rare problem from a single data point. It may see one solution 1,000 times and a rare solution once, and still be able to reference both.