I'm not an expert on this tech, so I could be talking out my ass, but what you're saying here doesn't ring completely true to me. I'm an avid consumer of Stable Diffusion based models. The community very easily trains adaptations to the network (LoRAs and the like) that push it in a particular direction, to the point that you can consistently get the model to produce a specific kind of output (e.g. perfectly replicating the style of a well-known artist).
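For anyone unfamiliar, here's a minimal, self-contained sketch of the low-rank adaptation (LoRA) idea behind those community fine-tunes: freeze a pretrained layer and learn a tiny low-rank correction on top of it. The shapes and hyperparameters below are illustrative, not any particular project's recipe.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        # Low-rank factors: B starts at zero, so the wrapped layer initially
        # behaves exactly like the original.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Original output plus the learned low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable}/{total} params")  # only a few % of the layer
```

Because only the tiny A and B matrices get trained, these adapters are cheap to make and share, which is why the community can churn out so many of them.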
I've also seen people fine-tune "jailbreaks" of popular open-source LLMs (e.g. Google Gemma) that remove the condescending ethical guidelines and just let you talk to the thing normally.
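In case it helps, this is roughly what those fine-tunes look like in practice: attach a small LoRA adapter to an open-weights causal LM and run ordinary supervised fine-tuning on examples of the behavior you want. The model name, target modules, and hyperparameters below are placeholder assumptions, not anyone's actual jailbreak recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "google/gemma-2b"  # placeholder: any open-weights LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Train only small low-rank adapters on the attention projections;
# the base model's weights stay frozen.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically a fraction of a percent

# From here, a standard supervised fine-tuning loop (e.g. with
# transformers.Trainer) on curated example conversations shifts the
# model's default behavior without retraining the base weights.
```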
So, all in all, I'm skeptical of the claim that there would be no value in having access to the training data. Clearly there's some ability to steer what these models produce, and that steering is done precisely by choosing what data they're trained on.