I very much appreciate that the authors not only published their code (https://github.com/llm-random/llm-random) but also included the dataset they used (available on Hugging Face - https://huggingface.co/datasets/c4), along with the training process and hyperparameters, so others can replicate and build on their work. The only thing really missing is the weights, which would be nice to have on Hugging Face as well.

It's very confusing to me that you are praising the authors of a published scientific paper for almost making their work reproducible.

If we had proper data version control, where the git commit hash was tied directly to the output data hash and hosted on IPFS (and the make system checked IPFS for cached artifacts the way it checks local files), then it would be absolutely reproducible.

And the wonderful thing is, every person that used git clone on this repo and ran it would be serving the NN weights.

But alas, this hasn't been done yet.
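
To make that concrete, here's a rough sketch of the caching step (assuming a local ipfs daemon; the lockfile name and the train.py entry point are made up for illustration, not taken from the llm-random repo):

    # Hypothetical sketch: key the trained weights to the current git commit,
    # check IPFS for a cached copy first, and publish the CID if we retrain.
    import pathlib
    import subprocess

    commit = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                            capture_output=True, text=True, check=True).stdout.strip()
    weights = pathlib.Path(f"checkpoints/model-{commit}.pt")
    lock = pathlib.Path(f"ipfs-{commit}.lock")  # maps this commit to an IPFS CID
    weights.parent.mkdir(exist_ok=True)

    if lock.exists() and subprocess.run(
            ["ipfs", "get", lock.read_text().strip(), "-o", str(weights)]).returncode == 0:
        print(f"fetched cached weights for {commit} from IPFS")
    else:
        # Cache miss: retrain from the versioned data, then publish the result.
        subprocess.run(["python", "train.py", "--output", str(weights)], check=True)
        cid = subprocess.run(["ipfs", "add", "-q", str(weights)],
                            capture_output=True, text=True, check=True).stdout.strip()
        lock.write_text(cid + "\n")  # anyone on the same commit can now fetch it

Anyone who fetches or re-adds the file through their own node ends up helping to serve it, which is where the "everyone who clones the repo serves the weights" part comes from.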

That's not what confusing means.

Feigned confusion

The weights aren't needed to make it reproducible. The code and training data are needed. Hopefully, if you used those, you'd ultimately reach the same result.

Even in the days when this was standard, that was not entirely the case.

There is a whole other world between "released code" and "getting the results as seen in the paper".

Unfortunately, the reproducibility crisis is very much alive and well! :'( There's much more to go into, but it is a deep rabbit hole, indeedy. :'((((

I guess I'm saying that if there are reproducibility problems without the weights, then there's still a reproducibility problem with them. A paper whose weights magically work, while training on the same data with the same algorithm doesn't, is a paper that isn't reproducible.

IMO, having the weights available sometimes just papers over a deeper issue.

Training, especially on large GPU clusters, is inherently non-deterministic, even if all seeds are fixed.

This boils down to framework implementations, timing issues, and the extra cost of trying to ensure determinism (with no guarantees even then).
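
For what it's worth, "fixing all the seeds" in PyTorch usually looks something like the sketch below (these are real PyTorch/CUDA settings, but the rest of the training setup is assumed), and even with all of it in place, multi-GPU timing and a few CUDA kernels can still make runs diverge:

    # The usual determinism knobs in PyTorch. Even with all of them set,
    # run-to-run bitwise identity is not guaranteed on large GPU clusters.
    import os
    import random

    import numpy as np
    import torch

    # Must be set before CUDA is initialized for deterministic cuBLAS behavior.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    SEED = 1234
    random.seed(SEED)
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

    # Prefer deterministic kernels; this is slower and raises an error for
    # ops that simply have no deterministic implementation.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False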

Random initialization would keep you from producing the exact same results.

Yes, but there's a difference between exact results and reproducible results. I should get similar performance; otherwise there is an issue.

It's a sad world where our standards are that low. But they are that low for good reasons.

If anything, CS papers are far more reproducible than most papers. Maybe that is sad, but I think most scientists and researchers are trying their best.

I understand where you're coming from but what they provided DOES make their work reproducible. You can use the data, source code, and recipe to train the model and get the weights.

It would be nice if they provided the weights so the model could be USABLE without the effort or knowledge required to train it.

We (I think) would all like to see more _truly_ open models (not just the source code) that enable collaboration in the community.

Only if they also include the random seed they used for the initial weights; otherwise you may be able to reproduce similar performance, but you're unlikely to obtain the same weights.

But that's a lot like saying that my recipe for muffins isn't reproducible because it doesn't say exactly which batch from which field my flour comes from. I mean, of course you won't get the same muffins, but if your muffins taste just as good, it's still a win.

If this work is valuable, the random seed shouldn't affect the outcome thaaat much.