I hadn't paid any attention to Rust before uv, but since starting to use uv, I've switched a lot of my performance-sensitive development to Rust (with Python interfaces). These sorts of improvements really do improve my quality of life significantly.
My hope is that conda goes away completely. I run an ML cluster and we have multi-gigabyte conda directories and researchers who can't reproduce anything because just touching an env breaks the world.
You might be interested in pixi, which is roughly to conda as uv is to pip (also written in Rust, it reuses the uv solver for PyPI packages)
Pixi has also been such a breath of fresh air for me. I think it's as big a deal as uv (it uses uv under the hood for the pure Python parts).
It's still very immature but if you have a mixture of languages (C, C++, Python, Rust, etc.) I highly recommend checking it out.
Yep, pixi is game changing. Especially for AI/ML, the ability to deal with non-Python dependencies nearly as fast as `uv` is huge. We have some exciting work leveraging the lower-level primitives pixi uses that we hope to share more about soon.
Pixi is what FreeCAD is now using. (Along with Rattler).
It makes building FreeCAD pretty trivial, which is a huge deal considering FreeCAD’s really complex Python and non-python, cross-platform dependencies.
This is something that uv advocates should pay attention to; there are always contexts that need different assumptions, especially with our ever-growing and complex pile of libraries and systems.
This seems to pretty much cover the same use cases as Mise. Is that true?
The main difference between mise and pixi is the ability to subscribe to conda channels and build conda environments extremely fast, bypassing or eliminating most of the conda frustration (regular conda users know what I mean). mise primarily installs asdf tools (last I checked).
On the Python front, however, I am somehow still an old faithful - poetry works just fine as far as I was ever concerned. I do trust the collective wisdom that uv is great, but I just never found a good reason to try it.
I wish the Python ecosystem would just switch to Rust. Things are nice over here… please port your packages to crates.
The unspoken assertion that Rust and Python are interchangeable is pretty wild and needs significant defense, I think. I know a lot of scientists who would see their first borrow checker error and immediately move back to Python/C++/Matlab/Fortran/Julia and never consider rust again.
I've never used a more hostile language than Rust. Some people hate Python and I can't understand why, but such is life. One man's meat....
For me uv seems to solve some of the worst pain points of Python, which is great since I have to work with it. I think for a lot of people the hate comes in when they have to maintain or deploy Python code in scenarios that Python and its libraries weren't designed for. Some stuff just makes Python seem like an "unserious" programming language to me:
1. Installation & dependencies: Don't install Python directly, instead install pyenv, use pyenv to install python and pip, use pip to install venv, then use venv to install python dependencies. For any non-trivial project you have to be incredibly careful with dependency management, because breaking changes are extremely common.
2. Useless error messages: Outside of trivial examples without external packages, I cannot remember when an error message actually pointed directly at the issue in the code. To give a quick example (pointing back to the point above), I got the error message "ImportError: cannot import name 'ChatResponse' from 'cohere.types'". A quick google search reveals that this happens if a) the cohere API key isn't set in ENV or b) you use langchain-cohere 0.4.4 with cohere 5.x, since the two aren't compatible.
3. Undisciplined I/O in libraries: Another ML library I recently deployed has a log-to-file mode. Fair enough, should be disabled before k8s deployment no biggie. Well, the library still crashes because it checks if it has rwx-permissions on a dir it doesn't need.
4. Type conversions in C-interop: Admittedly I was also on the edge of my own capabilities when I dealt with these issues, but we had issues with large integers breaking when using numpy/pandas in between to do some transforms. It was a pain to fix, because Python makes it difficult to understand what's in a variable, and what happens when it leaves Python.
1. and 4. are mainly issues with people doing stuff in Python that it wasn't really designed for. Using Python as a scripting language or a thin (!) abstraction layer over C is where it really shines. 2. and 3. have more to do with the community, but it is compounded by bad language design.
1. is true, yup people have been ragging on the python install superfund site problem for years, but the rest of those are entirely 3rd party library issues. It's like saying Windows is not a serious operating system because you installed a buggy application.
2. I've used a ton of languages and frankly Python has the best tracebacks hands-down, it's not even close. It's not Python's fault a 3rd party library is throwing the wrong error.
3. Again, why is it bad language design that a library can do janky things with I/O?
4. FFI is tricky in general, but this sounds like primarily a "read the docs" problem. All of the major numeric acceleration libraries have fixed-size numbers; Python itself uses a kind of bigint that can be any size. You have to stay in the arrays/tensors to get predictable behavior. This is literally Python being "a thin abstraction layer over C."
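To make that concrete, here's a minimal sketch (plain NumPy, nothing specific to the original project) of where Python's arbitrary-precision ints stop and fixed-size C integers begin:

```python
import numpy as np

big = 2**80          # Python ints are arbitrary precision, so this is fine
print(big * big)     # still exact, no overflow

# Crossing into NumPy means fixed-size C integers; the same value
# can't make the trip into an int64 array.
try:
    np.array([big], dtype=np.int64)
except OverflowError as err:
    print("OverflowError:", err)

# And once inside, arithmetic wraps like C integers rather than growing
# like Python ints.
a = np.array([2**62], dtype=np.int64)
print(a * 4)         # wraps to [0] instead of 2**64
```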
I'm deliberately not differentiating between the language, the tool-chain, the libraries and the community. They are all closely connected, and in the end you're always buying into the bundle.
2. I would argue that the ubiquity of needing stack traces in Python is the main problem. Why are errors propagating down so deep? In Rust I know I'm in trouble when I am looking at a stack trace. The language forces you to handle your errors, and while that can feel limiting, it makes writing correct and maintainable code much more likely. Python lets you be super optimistic about the assumptions of your context and data (see the sketch after this list) - which is fine for prototyping, but terrible for production.
3. I agree that this isn't directly a language design issue, but there's a reason I feel the pain in Python and not in Rust or Java. Dynamic typing means you don't know what side effects a library might have until runtime. But fundamentally it is a skill issue. When I deploy code from Java, Go or Rust people, they generally know I/O is important and have spent the necessary time thinking about it. JS, Python or Ruby devs don't.
4. The issue is that Python's integer handling sets an expectation that numbers "just work," and then that expectation breaks at the FFI boundary. And once you're off the trodden path, things get really hard. The cognitive load of tracking which numeric type you're in at any moment sucks. I completely agree that this was a skill issue on my part, but I am quite sure that I would not have had that problem in a properly typed, compiled language.
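A minimal sketch of the "optimistic assumptions" failure mode from point 2 (the function and data are hypothetical): nothing forces the caller to handle the malformed case, so the problem only surfaces as a runtime traceback, possibly deep inside a call chain.

```python
def total_cost(order: dict) -> float:
    # Optimistically assumes both keys are present and numeric.
    return order["price"] * order["quantity"]

orders = [{"price": 9.99, "quantity": 3}, {"price": 4.50}]  # second one is malformed
print(sum(total_cost(o) for o in orders))  # KeyError: 'quantity', only at runtime
```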
I do think some module writers get overexcited about using some dynamic features and it can be hard to understand what's going on. On the other hand....
Dynamic languages let you do a lot of things with meta classes and monkey-patching that allow you to use a library without needing to build your own special version of it. With some C or C++ library that did something bad (like the logging one you mentioned) there's nothing for it but to rebuild it. With Python you may well be able to monkey patch it and live to fight another day.
It is great when you're dealing with 3rd party things that you either don't have the source code for or where you cannot get them to accept a patch.
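As a rough illustration (the module and function names are made up, standing in for something like the logging library above), the whole patch is just an attribute assignment made before anything calls the offending code:

```python
import types

# Stand-in for a real `import thirdparty`; module and function are hypothetical.
thirdparty = types.ModuleType("thirdparty")

def _setup_file_logging(path="/var/log/app.log"):
    # The problematic behavior: insists on a directory it doesn't actually need.
    raise PermissionError(f"no write access to {path}")

thirdparty.setup_file_logging = _setup_file_logging

# The monkey patch: swap in a no-op before the library gets a chance to run it.
thirdparty.setup_file_logging = lambda *args, **kwargs: None

thirdparty.setup_file_logging()  # now harmless; no fork or rebuild required
```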
“Python lets me do horrifying things no one should ever do in production code” isn’t the flex you think it is.
Why should no-one ever do them? They're useful. :-) FastAPI uses this stuff to make itself easier to use for example. They're things I would have killed for when I was writing C++.
> Dynamic typing means you don't know what side effects a library might have until runtime.
Static typing (in most industrially popular languages) doesn't tell you anything about side effects, only expected inputs and return values.
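A small sketch of that point (names are illustrative): the signature below type-checks fine and says nothing about the fact that the function also writes to disk.

```python
from pathlib import Path

def load_config(name: str) -> dict:
    # The annotation advertises str -> dict; the write below is invisible to it.
    Path("/tmp/last_loaded.txt").write_text(name)  # hidden side effect
    return {"name": name}
```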
I don’t really hate Python but would absolutely never use it as the main language for a large code base. I think what people hate is other people trying to use a scripting language like Python in places where you have large code bases and large teams. Scripting languages in general are terrible for that, as they give you almost no compile-time guarantees about anything! But I always thought that for people whose main job is not programming and whose scripts don’t get larger than a couple of thousand lines, Python is a good choice… though Lisp would perhaps be even better if historically it had gotten the huge mindshare and resulting ecosystem.
It has repeatedly been seen as a scripting language because people saw it as a replacement for Perl.
Perl was the language of the big hack. It had very little to offer in the way of abstractions and OO. So you were kind of excused from writing well-structured code, and people wrote a lot of janky, untestable stuff in Perl and carried on doing that in Python.
In Python, to get good code you absolutely have to have unit tests and some kind of integration tests. I worked on a roughly 50,000-line Python program. It wasn't bad even though we didn't have type hints then. Occasionally we did discover embarrassing bugs that a statically typed language would not have permitted, but we wrote a test and closed that door.
Right, I know it is possible, but I have to ask why?? If you can write a 50,000-line program, you probably could handle a better language for that. I'm on the side of wishing my languages did a lot more for me than they currently do! Rust is a good step, but I do find it taxing to write... I am currently considering some functional languages like Gleam (it compiles to Erlang + JS and looks a bit like Elm, which is awesome but frontend-only) and even more research stuff like Flix.
But day to day, I am happy to write Kotlin and Dart :D they give you amazing tooling and just good enough type guarantees.
Anyway, good to have a bunch of options :).
Because it isn’t a statically typed language, is terribly, terribly slow, has horrible production environments and has a lot of quirks?
I don’t understand why anybody would ever develop anything in Python other than “I want to write software but can’t be arsed to follow software design principles”, with all the mess that follows from it.
Congrats on finding the reason: it's easier in Python
> please port your packages to crates.
This is such a deeply unserious take. Do you have hundreds of thousands of hours to give out for free? No?
As far as I get it, conda is still around because uv is focused on python while conda handles things written in other languages. Unless uv gets much more universal than expected, conda is here to stay.
There is also pixi (which uses uv for the python side of things) which feels like uv for conda.
Pixi is great! It doesn't purely use uv though. I just love it. It solves the "create a repo that runs natively on any developer's PC" problem quite well. It handles different dependency trees per OS for the same library too!
conda (and its derivatives that are also “conda” now), and conda-forge specifically, are the best ways to install things that will work across operating systems, architectures, and languages - without having to resort to compiling everything.
Want to make sure a software stack works well on a Cray with MPI+cuda+MKL, macOS, and ARM linux, with both C++ and Python libraries? It’s possible with conda-forge.
Conda is hell for multi-operating-system projects. Its lock file is OS-dependent. You can't commit it and hope it will work anywhere.
It is probably the easiest way to install a lot of binary dependencies, good for people who don't have experience with software development and don't care about reproducibility.
You generate a separate lock file for each OS/arch variant. You specify the dependency versions in a different file (it may also be in a recipe).
But either you're not doing anything that is OS specific (and then you probably could just use pip), or the OS does make a difference, and hence you need to reflect that in the lock file.
For me the best way to install things across operating systems has been nix. I wish it was more popular in the ML community.
You can do all of the above with Wheels.
Can you show me someone who has packaged log4cxx in a wheel? Is it in pip?
Arbitrary examples, I know, but I moved a large piece of software that was a truly mixed C++ and Python project to conda-forge, and all sorts of random C++ dependencies were in there, which drastically simplified distribution and drastically reduced compile time.
If I were doing it today, it might be nix+bazel, or maybe conda+bazel, but maintaining a world of C++ libraries for distribution as wheels does not sound like fun - especially because nobody is doing that work as a community now.
I wrap Rust/CUDA programs in wheels. I've packaged arbitrary binaries in wheels, like software that's not directly related to Python. Can't say it works for everything, but I suspect so. You just run `maturin build`, then it can be installed with pip.
Conda is like jQuery or Bootstrap: it was necessary before the official tools evolved. Now we don't need them any more, but they still are around for legacy reasons. You still need it, for example, for some molecular dynamics packages, but that's due to the package publishers choosing it.
Except the ONE annoying quirk that certain major projects and repos let their conda distribution get stale.
I work professionally in ML and have not had to touch conda in the last 7 years. In an ML cluster, it is hopefully containerized and there is no need for that?
Very common in education/research systems. Even the things which are containerised often have conda in them.
> I work professionally in ML and have not had to touch conda in the last 7 years. In an ML cluster, it is hopefully containerized and there is no need for that?
I wish my life had been like this. Unfortunately I always appear to end up needing to make this stuff work for everyone else (the curse of spending ten years on Linux, I suppose).
But then ML is a very broad church, and particularly if you're a researcher in a bigger company then I could see this being true for lots of people (again, I wish this was me).
It's still used in edu and research. Haven't seen it in working environments in quite some time either.
At least on my cluster, few if any workloads are containerized. We also have an EKS where folks run containerized, but that's more inference and web serving, rather than training.
It would be nice indeed if there was a good solution to multi-gigabyte conda directories. Conda has been reproducible in my experience with pinned dependencies in the environment YAML... slow to build, sure, but reproducible.
I'd argue bzip compression was a mistake for Conda. There was a time when I had Conda packages made for the CUDA libraries so conda could locally install the right version of CUDA for every project, but boy it took forever for Conda to unpack 100MB+ packages.
It seems they are using zstd now for .conda packages, i.e., bzip2 is obsolete, so that should be faster.
uv does it by caching versions of packages so they can be shared across projects/environments. So you still have to store those multi-gig directories but you don't have so much duplication. If conda could do something similar that would be great
Have you found it easy to write rust modules with python interfaces? What tools do you recommend?
PyO3 + Maturin is fine, but it's a bit tedious if you have a big API. Some annoyances:
If you're maintaining a single lib and want to expose it to Python, it's fine once you find a pattern that works. If you have a network and/or your API is large or not stable, it's a mess (but doable).

PyO3 and Maturin are excellent. I've been maintaining a Python module written in Rust for several years now, and it's been quite smooth.
I'd be interested in this too. I know it's possible, but haven't found a good guide on how to do it well and manage the multi-lang complexity.
Many people use PyO3 for that
The topic of managing large dependency chains for ML/AI workloads in a reproducible way has been a deep rabbit hole for us. If you are curious, here is some of the work in the open domain:
https://docs.metaflow.org/scaling/dependencies
https://outerbounds.com/blog/containerize-with-fast-bakery
As a person who has successfully used uv for ml workloads, I'm curious what makes you still stay with Conda.
Conda is much, much better for the C/Fortran/C++ parts of data science/machine learning workloads.
Like, I had real issues with GDAL and SQLite/spatialite on MacOS (easy on Linux) which uv was of no help with. I ended up re-architecting rather than go back to conda as uv is just that good (and it was early stage so wasn't that big of a deal).
I stay with it because last time I tried uv it was still directory/project focused vs. environment. With conda it doesn't matter where I am, I can activate any of my environments and run code
cuda-nvcc was a blocker for us but it looks like a stable version is coming to pypi
Have you figured out a good way to manage CUDA dependencies with uv?
CUDA is part of our cluster install scripts, we don't manage that with uv or conda. To me, that should be system software that only gets installed once.
What do you do when a new CUDA version is released?
https://docs.astral.sh/uv/guides/integration/pytorch/#using-...
does that fit the bill?
Not the OP but does this actually package CUDA and the CUDA toolchain itself or just the libraries around it? Can it work only with PyTorch or "any" other library?
The Conda packaging system and registry are capable of understanding things like ABI and binary compatibility. They can resolve not only Python dependencies but the binary dependencies too. Think dnf, yum, or apt, but OS-agnostic, including Windows.
As far as I know (apart from blindly bundling wheels), neither PyPI nor the Python packaging tools have any knowledge of ABIs or pure C/C++/Rust binary dependencies.
With Conda you can even use it to just have OS-agnostic C compiler toolchains, no Python or anything. I actually use Pixi for shipping an OS-agnostic libprotobuf version for my Rust programs. It is better than containers since you can directly interact with the OS like the Windows GUI and device drivers or Linux compositors. Conda binaries are native binaries.
Until PyPI and setuptools understand the binary intricacies, I don't think they will be able to fully replace Conda. This may mean that they need to have an epoch and API break in their packaging format and the registry.
uv, poetry etc. can be very useful when the binary dependencies are shallow and do not deeply integrate or you are simply happy living behind the Linux kernel and a container and distro binaries are fulfilling your needs.
When you need complex hierarchies of package versions where half of them are not compiled with your current version of the base image and you need to bootstrap half a distro (on all OS kernels too!), Conda is a lifesaver. There is nothing like it.
No, it’s PyTorch built against a particular version of CUDA. You need to install that on your system first.
If I find myself reaching a point where I would need to deal with ABIs and binary compatibility, I pretty much stop there and say "is my workload so important that I need to recompile half the world to support it", and the answer (for me) is always no.
Well, handling OS-dependent binary dependencies is still unsolved because of the intricate behavior of native libraries and especially how tightly C and C++ compilers integrate with their base operating systems. vcpkg, Conan, containers, Yocto, and Nix all target a limited slice of it. So there is no fully satisfactory solution. Pixi comes very close though.
The Conda ecosystem is forced to solve this problem to a point, since ML libraries and their binary backends are terrible at keeping their binaries ABI-stable. Moreover, different GPUs have different capabilities and support different versions of GPGPU execution engines like CUDA. There is no easy way out without solving dependency hell.
If you’re writing code for an accelerator, surely you care enough to make sure you can properly target it?
What about nix?
Doesn't work on Windows.
It is also quite complex and demands a huge investment of time to understand its language, which isn't very nice to program in.
The number of cached combinations of various ABI and dependency settings is small with Nix. This means you need source compilation of a considerable number of dependencies. Conda generally contains every library built with the last 3 minor releases of Python.
Obligatory: Rust is not the only language that would be faster than Python; Go, C, or C++ should all exhibit the performance you are seeing in uv, had it been written in one of those languages. But Rust definitely makes it easy with Cargo.
The curmudgeon in me feels the need to point out that fast, lightweight software has always been possible, it's just becoming easier now with package managers.
I've programmed in all those languages before (learned C in '87, C++ in '93, Go in 2015 or so) and to be honest, while I still love C, I absolutely hate what C++ has become. Go never appealed to me (they really ignored numeric work for a long time). Rust feels like somebody wanted to make a better C with more standard libraries, without going down the crazy path C++ took.
Also this. I liked C (don't use it now, right now it is mostly Java) but C++ didn't appeal to me.
For me Rust is similar to C, just like you wrote: it is better and bigger, but not overwhelming the way C++ is (and Rust has Cargo; I don't know if C++ has anything comparable).
I actually got interested in Rust because its integer types and the core data structures looked sane, instead of this insanity: https://en.cppreference.com/w/cpp/types/integer.html . Fluid integer types are evil.
I stayed for the native functional programming, first-class enums, the good parts of C++, and the ultimate memory safety.
OO is supposed to make life easier but C++ exposes all the complexity of the implementation to you. Its approach to hiding complexity is to shove it partially under a carpet with sharp bits sticking out.
That is exactly how I feel about it. I've always loved C for its simplicity and Rust felt like an accidental love letter.
NOW? with package managers