The catch is that the benefits of open vs. non-open don't translate neatly from software to models. If software is binary-only, it is exceedingly difficult to change it in any substantial way (you can patch the machine code directly, of course, but the very nature of the format makes this very limited). OTOH, with a large language model that has open weights but no open training data - the closest equivalent to open-source software - you can still change its behavior very substantially through fine-tuning or by remixing layers (even from different models!).
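
To make that concrete, here's a minimal sketch of a single fine-tuning step against an open-weight checkpoint, assuming a Hugging Face-style model; the model name and training text are placeholders:

```python
# A minimal sketch of changing an open-weight model's behavior via
# fine-tuning. Model name and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "some-org/open-weights-7b"  # hypothetical open-weight checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
batch = tokenizer("example fine-tuning text", return_tensors="pt")

# One gradient step: the weights move even though we never saw the
# original training data -- the "binary" is directly editable.
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
```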

> OTOH, with a large language model that has open weights but no open training data - the closest equivalent to open-source software - you can still change its behavior very substantially through fine-tuning or by remixing layers (even from different models!).

The closest thing to open source would be open training data. The weights are the binary, the training data is the source, and the training process that produces the weights is the compilation step.
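
As a toy illustration of that mapping, here's a sketch where random data stands in for the "source" and a training loop plays the role of the "compiler":

```python
# Toy version of the analogy: training data in, weights out,
# with the training loop acting as the compiler.
import torch

training_data = [(torch.randn(4), torch.randn(1)) for _ in range(100)]  # the "source"
model = torch.nn.Linear(4, 1)                                           # will hold the "binary"
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for x, y in training_data:  # the "compilation" pass
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

torch.save(model.state_dict(), "weights.pt")  # ship the "binary"
```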

Fine-tuning is just modding the binaries. Remixing layers from different models is building a workflow pipeline by wiring components of one binary software package together with components from other binary packages.
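
For example, here's a rough sketch of grafting layers from one checkpoint into another of the same architecture; the file names and layer prefixes are hypothetical:

```python
# Sketch of "remixing layers": grafting weights from one checkpoint
# into another with the same architecture. File names are placeholders.
import torch

base = torch.load("model_a.pt")   # state_dict of model A
donor = torch.load("model_b.pt")  # state_dict of model B, same shapes

# Overwrite a slice of A's layers with B's -- like wiring components
# from two binary packages into one pipeline.
for name in base:
    if name.startswith(("layers.10.", "layers.11.")):  # hypothetical layer names
        base[name] = donor[name]

torch.save(base, "remixed.pt")
```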