Even in the days when this was standard, that wasn't entirely the case.
There is a whole other world between "released code" and "getting the results as seen in the paper".
Unfortunately, the reproducibility crisis is very much alive and well! :'( There's much more to go into, but it's a deep rabbit hole, indeedy. :'(
I guess I'm saying that if there are reproducibility problems without the weights, then there are still reproducibility problems with them. A paper whose weights magically work, while training on the same data with the same algorithm doesn't, is a paper that isn't reproducible.
IMO, having the weights available sometimes just papers over a deeper issue.
Training, especially on large GPU clusters, is inherently non-deterministic, even if all seeds are fixed.
This boils down to framework implementations, timing issues, and the extra cost of trying to enforce determinism (with no guarantees even then).
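For anyone curious, here's roughly what "fixing all the seeds" actually entails in practice. This is a minimal sketch assuming PyTorch; the seed value is just illustrative, and the comments flag where determinism still falls apart:

    import os
    import random

    SEED = 1234  # illustrative value, not from any particular paper

    # Must be set before cuBLAS initializes for deterministic GEMMs
    # (CUDA >= 10.2); ":4096:8" and ":16:8" are the two accepted values.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    import numpy as np
    import torch

    # Seed every RNG in sight: Python, NumPy, and all torch devices.
    random.seed(SEED)
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)  # no-op on CPU-only machines

    # Disable cuDNN autotuning (it picks kernels by benchmarking, which
    # varies run to run) and request deterministic cuDNN kernels.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

    # Raise a RuntimeError instead of silently using an op that has no
    # deterministic implementation (e.g., some atomics-based CUDA kernels).
    torch.use_deterministic_algorithms(True)

Even after all of this: the deterministic kernels are often slower, some ops will simply error out under use_deterministic_algorithms, and none of it pins down the floating-point reduction order of multi-GPU collectives across cluster configurations. So bitwise-identical training runs on a big cluster are still not something you can count on.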