> how many other people have encountered a problem close enough to yours and solved it somewhere on the open internet
I'm 100% sure that all our web, cc, codex, or whatever sessions are used for training, RL, or both.
This makes the universe that the models know about at least one order of magnitude bigger than the open internet.
I think you have misused the term "order of magnitude" or just don't grasp the scale of the internet.
I get how this is a truism now, but I never really understood why it would be useful to scrape cc/codex sessions for training. The relative amount of human input in those sessions is so low (isn't that why they are so loved and used?), so how could it actually be useful to them? Wouldn't you wanna focus on people not using it?
It's more useful as a set of feedback on the model results. You can do sentiment analysis on the user responses to see whether they found the model results useful, frustrating, etc., and use that to guide future training.
Because you provide them with the "problem" and the "solution" and once you have both you can scale your RL pipeline.
I think this is a rosy estimate. The vast majority of what people do with these models is just the same old shit; I would be surprised if 1% of it were genuinely novel stuff worth folding back into the training data.
Even if it "is just the same old shit", they have much more data, and of much higher quality, to scale the RL pipeline.