My ideal situation is that the system maintains an authoritative copy of every package version that is ever requested, so that individual apps and projects don't need to ship their own copies. Multiple versions of a package should coexist:

    /usr/lib/python3.12/torch/2.1.0/
    /usr/lib/python3.12/torch/2.1.1/
    /usr/lib/python3.12/torch/2.1.2/
When a package requests 2.1.1, it gets it straight from there, installing from PyPI only if that version isn't already present.
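
This isn't how pip works today; purely as a sketch of the idea, here is what resolving against a system-wide, versioned store could look like (the /usr/lib/python3.12/<pkg>/<version>/ layout and the resolve() helper are my own assumptions, not an existing feature):

    import pathlib
    import subprocess
    import sys

    STORE = pathlib.Path("/usr/lib/python3.12")

    def resolve(package: str, version: str) -> pathlib.Path:
        """Return the directory holding exactly this version of the package,
        installing it from PyPI only if it is not already in the store."""
        target = STORE / package / version
        if not target.exists():
            # One authoritative copy per released version, shared by every project.
            subprocess.run(
                [sys.executable, "-m", "pip", "install",
                 f"{package}=={version}", "--target", str(target)],
                check=True,
            )
        return target

    # A project that wants torch 2.1.1 would then put that directory on its path:
    sys.path.insert(0, str(resolve("torch", "2.1.1")))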

The same should be true of JS and even C++. When a C++ app's deb package wants libusb==1.0.1, it should NOT overwrite the libusb-1.0.0 that is already on the system; it should coexist with it and link against the correct one, so that another app that wants libusb-1.0.0 can still use it.
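
Shared-library versioning already lets different major versions coexist (libusb-0.1 and libusb-1.0 ship different sonames); what's missing is the patch-level coexistence described above. A hedged sketch, assuming a hypothetical /usr/lib/libusb/<version>/ store like the Python one, of picking an exact version explicitly:

    import ctypes

    # Hypothetical per-version store; today you'd get whatever the
    # libusb-1.0.so symlink on the system currently points at instead.
    lib_1_0_1 = ctypes.CDLL("/usr/lib/libusb/1.0.1/libusb-1.0.so")
    lib_1_0_0 = ctypes.CDLL("/usr/lib/libusb/1.0.0/libusb-1.0.so")  # loaded side by side via its own path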

> Fortunately the Python community is much more serious about making deps that work together

This is very much not true, at least in ML. I have to create a new conda environment for almost every ML paper that comes out. There are so many papers and code repos I test every week that refuse to work with the latest PyTorch, and some that require torch<2.0 or some bull. Also, xformers, apex, pytorch3d, and a number of other popular packages require that the CUDA version bundled with the "torch" Python package matches the CUDA version in /usr/local/cuda AND that your "CC" and "CXX" variables point to gcc-11 (NOT gcc-12), or else the pip install will fail. It's a fucking mess. Why can't gcc-12 compile gcc-11 code without complaining? Why does a Python package not ship binaries of all C/C++ parts for all common architectures, compiled on a build farm?
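
For anyone hitting this, a small sanity-check sketch of the usual suspects before kicking off a long build (nothing here is specific to xformers or apex; the /usr/local/cuda path and the compiler expectations are just the setup described above):

    import os
    import re
    import subprocess

    import torch

    # CUDA version the installed torch wheel was built against, e.g. "12.1".
    torch_cuda = torch.version.cuda

    # CUDA version of the system toolkit that source builds will compile against.
    nvcc_out = subprocess.run(
        ["/usr/local/cuda/bin/nvcc", "--version"],
        capture_output=True, text=True, check=True,
    ).stdout
    system_cuda = re.search(r"release (\d+\.\d+)", nvcc_out).group(1)

    if torch_cuda != system_cuda:
        print(f"Mismatch: torch built for CUDA {torch_cuda}, /usr/local/cuda is {system_cuda}")

    # Many of these builds also care which compiler CC/CXX point at.
    for var in ("CC", "CXX"):
        print(var, "=", os.environ.get(var, "<unset>"))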

I'm assuming by "system" you mean the OS, which is a terrible, terrible idea. The dev stack and system libs should not be mixed, especially because system libs should be vetted by the OS vendor, and you can't ask them to do that for dev libs.

> I have to create a new conda environment for almost every ML paper that comes out

That's how it's supposed to work: one env per project.

As for the rest, it says more about the C/C++ community building the things below the Python wrappers.

> one env per project

That causes 50 copies of the exact same version of a 1GB library to exist on my system, all obtained from the same authority (PyPI). I have literally 50 copies of the entire set of CUDA libraries, because every conda environment installs PyTorch and PyTorch includes its own CUDA.
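
A rough way to put a number on it (the miniconda3 env root and the nvidia/ directory layout inside site-packages are assumptions; adjust for your setup):

    from pathlib import Path

    envs = Path.home() / "miniconda3" / "envs"
    total = 0
    for env in sorted(envs.iterdir()):
        # Size of the bundled CUDA libraries that pip-installed torch drags in.
        size = sum(
            f.stat().st_size
            for f in env.glob("lib/python*/site-packages/nvidia/**/*")
            if f.is_file()
        )
        total += size
        print(f"{env.name}: {size / 1e9:.1f} GB of bundled CUDA libraries")
    print(f"Total across envs: {total / 1e9:.1f} GB")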

I'm not asking the OS to maintain this; rather, the package manager ("npm" or "pip" or similar) should do so on a system-wide basis. "python" and "pip" should allow one copy per officially-released version of each package to live on the system, with multiple officially-released version numbers coexisting in /usr/lib. If a dev version is being used, or any version that deviates from what is on PyPI, then that should live within the project.

Actually, conda creates hardlinks for the packages that it manages. I found this out a few weeks ago when I tried migrating my envs to another system with an identical hierarchy and ended up with a broken mess.
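
You can check this yourself by comparing link counts and inodes between an env and conda's pkgs/ cache; it also explains why a naive copy to another machine silently loses the sharing. The paths below assume a default miniconda install and are illustrative only:

    import os
    from pathlib import Path

    env_file = Path.home() / "miniconda3/envs/myenv/lib/python3.12/site-packages/numpy/__init__.py"
    st = os.stat(env_file)
    print("hardlink count:", st.st_nlink)  # >1 means the file is shared with the pkgs/ cache
    print("inode:", st.st_ino)             # the same inode shows up under ~/miniconda3/pkgs/...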

> but rather the package manager ("npm" or "pip" or similar) should do so on a system-wide basis.

I basically agree with this, with the caveat that programs should not use any system search paths and packages should be hardlinked into the project directory structure from a centralized cache. This also means that a dev version looks identical to a centralized version: both are just directories within the project.

Are you just describing something close to Nix?? In any case, Nix solves a lot of these problems.

Kind of, but not really. Nix is extremely complicated. Programs / projects including their dependencies is exceedingly simple.

Also, Windows is my primary dev environment. Any solution must work cross-platform and cross-distro. Telling everyone to use a specific distro is not a solution.

It is complicated... but honestly I have found Claude 3.5 to just 'fix it', so you hardly have to spend any time spelunking. You just give it all your dependencies and tell it what you want, and it'll whip up a working flake in a few iterations. Kinda magic. So yeah, when you can abstract out the complexity, it moves the needle enough to make it worth it.

Nix != NixOS. It runs on WSL: https://github.com/nix-community/NixOS-WSL

Less than zero interest in WSL.

Nix fans are becoming as obnoxious as Rust fans. And I say that as an at-times annoying Rust fan.

Ah, sorry I misunderstood.

Yes, it would be nice to have that by default.

In fact, it's what uv (https://github.com/astral-sh/uv) does, and one of the reasons it's so fast and became so popular so quickly.

Astral for the win.

I don't think that's true for the exact same version: https://stackoverflow.com/a/57718049 (i.e. it's deduplicated)

ML researchers might be thinking that their paper will be obsolete next month, so why bother taking the time to make their coding environment reproducible?

It’s not the researcher’s fault if the libraries they use make breaking changes after a month; proof-of-concept code published with a paper is supposed to be static, and there’s often no incentive for the researcher to maintain it after publication.

At this point, venvs are the best workaround, but we can still wish for something better. As someone commented further up, being able to “import pytorch==2.0” and have multiple library versions coexist would go a long way.
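
There's no "import pytorch==2.0" today, but as a sketch of what it could mean, assuming the versioned /usr/lib/python3.12/<pkg>/<version>/ store discussed at the top of the thread (the helper below is hypothetical):

    import importlib
    import sys

    def import_version(package: str, version: str):
        """Import `package` from a per-version store directory instead of site-packages."""
        sys.path.insert(0, f"/usr/lib/python3.12/{package}/{version}")
        try:
            return importlib.import_module(package)
        finally:
            sys.path.pop(0)

    # Roughly what "import pytorch==2.0" would mean (still one version per
    # process, since sys.modules caches modules by name).
    torch = import_version("torch", "2.0.0")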