This looks very interesting but I don't really understand what he has done here. Can someone explain the process he has gone through in this analysis?

He gave gpt-oss an empty prompt and ran it many times. Because sampling uses a nonzero temperature, the outputs vary quite a lot from run to run. He then sampled the results.

Feeding an empty prompt to a model can be quite revealing about what data it was trained on.

Not an empty prompt but a one-token prompt:

>> i sample tokens based on average frequency and prompt with 1 token

https://x.com/iamgrigorev/status/1953919577076683131
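For anyone trying to picture the loop being described, here is a rough sketch of the idea as I understand it from the quote: pick a starting token in proportion to its frequency, use that single token as the prompt, generate with temperature sampling, and repeat many times. This is not the author's code. `TOKEN_FREQ`, `TOY_LOGITS`, and `DEFAULT_LOGITS` are made-up toy tables standing in for real token statistics and for the actual model, so that the example runs with only the standard library.

```python
import math
import random
from collections import Counter

# Hypothetical token frequencies (made up; a real run would use corpus statistics).
TOKEN_FREQ = {"the": 50, "def": 20, "import": 15, "Once": 10, "{": 5}

# Toy stand-in for the model: next-token logits keyed by the previous token.
TOY_LOGITS = {
    "the": {"model": 2.0, "data": 1.5, "<eos>": 0.1},
    "def": {"main": 2.0, "run": 1.0, "<eos>": 0.1},
    "import": {"os": 1.5, "sys": 1.5, "<eos>": 0.1},
    "Once": {"upon": 3.0, "<eos>": 0.1},
    "upon": {"a": 3.0, "<eos>": 0.1},
    "a": {"time": 3.0, "<eos>": 0.5},
}
DEFAULT_LOGITS = {"<eos>": 1.0, "the": 0.5}


def sample_start_token(freq, rng):
    """Pick the one-token prompt in proportion to token frequency."""
    tokens = list(freq)
    weights = [freq[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def sample_with_temperature(logits, temperature, rng):
    """Softmax sampling: higher temperature flattens the distribution."""
    tokens = list(logits)
    scaled = [logits[t] / temperature for t in tokens]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(start, temperature, rng, max_new_tokens=8):
    """Continue from a single start token until <eos> or the length cap."""
    out = [start]
    for _ in range(max_new_tokens):
        logits = TOY_LOGITS.get(out[-1], DEFAULT_LOGITS)
        nxt = sample_with_temperature(logits, temperature, rng)
        if nxt == "<eos>":
            break
        out.append(nxt)
    return out


def run_many(n, temperature=1.0, seed=0):
    """Repeat the one-token-prompt experiment n times."""
    rng = random.Random(seed)
    return [
        generate(sample_start_token(TOKEN_FREQ, rng), temperature, rng)
        for _ in range(n)
    ]


if __name__ == "__main__":
    samples = run_many(1000)
    # Tally which kinds of continuations show up most often.
    print(Counter(" ".join(s) for s in samples).most_common(5))
```

With a real model you would swap the toy lookup for the model's actual next-token logits; the interesting part is then looking at what kinds of text dominate across thousands of runs.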