the topic of managing large dependency chains for ML/AI workloads in a reproducible way has been a deep rabbit hole for us. if you are curious, here is some of the work we have done in the open:
https://docs.metaflow.org/scaling/dependencies https://outerbounds.com/blog/containerize-with-fast-bakery