> This difference is particularly noticeable with multiple images sharing the same base layers. With legacy storage drivers, shared base layers were stored once locally and reused across all images that depended on them. With containerd, each image stores its own compressed copy of shared layers, even though the uncompressed layers are still de-duplicated through snapshotters.
This seems like a really weird decision. If base images are duplicated for every image you have, that will add up quickly.
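Back-of-the-envelope, with made-up digests and sizes (purely illustrative, not how containerd actually lays things out on disk), the gap between "store each compressed blob once" and "each image keeps its own compressed copy" looks like this:

```python
# Hypothetical layer digests and sizes in MB, just to illustrate the arithmetic.
images = {
    "inference:v1": {"sha256:base-os": 80, "sha256:cuda-torch": 3500, "sha256:app-v1": 50},
    "inference:v2": {"sha256:base-os": 80, "sha256:cuda-torch": 3500, "sha256:app-v2": 60},
}

# One copy per unique compressed blob, shared across images.
unique = {digest: size for layers in images.values() for digest, size in layers.items()}
print("shared blobs:   ", sum(unique.values()), "MB")   # 3690 MB

# Each image keeps its own compressed copy of every layer it uses.
per_image = sum(sum(layers.values()) for layers in images.values())
print("per-image blobs:", per_image, "MB")               # 7270 MB
```

Multiply that by a few dozen tags and the gap is exactly what adds up.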
I think there is an Issue/PR right now to change this. See: https://github.com/containerd/containerd/issues/13307
Oh, very glad to see this; the ML applications mentioned in it are exactly why I thought this was such a disastrous change.
However, the tedium of the reply chain reminds me why I tend to focus most of my energy on internal projects rather than external open source...
Docker may have been built for a specific type of use case that most developers are familiar with (e.g. web apps backed by a DB container), but containerization is useful across so many very different areas of computing. Something that seems trivial in the Python/DB space, having one or two small duplicates of OS layers, is very different once you have 30 containers for different models+code, plus ~100 more dev containers lying around as artifacts from building, pushing, and pulling, each at ~10GB. At that scale the inefficient new system is just painful.
The smallest PyTorch container I ever built was 1.8GB, and that was just for some CPU-only inference endpoints; it took several hours of yak shaving to achieve, and after a month or two of development it had ballooned back to 8GB. Containers with CUDA, or with other significant AI/ML libraries, get really big. YAGNI is a great principle for your own code when writing from scratch, but it gets dangerous when an entire ecosystem has been built on your product and things are being rewritten from scratch, because the "you" is far larger than the developer making the change. Docker's core feature has always been reusable and composable layers, so seeing it abandoned looks like somebody took YAGNI to a far-too-extreme conclusion in their own corner of the computing world.
Docker(Hub) just isn't built for this use case. I've built https://clipper.dev to better handle ML/large images. It consists of a registry plus a pull client that break layers apart and content-address individual chunks by their _uncompressed_ hash, so content can be shared much more effectively. My pull client also has better parallelization and wastes far less bandwidth. It annoys the heck out of me when I change one file in a layer and have to redownload bytes my device already has. By sharing across layers I've seen 80-90% improvements in pull times for "patches".
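To be clear, this isn't Clipper's actual code, just a minimal sketch of the chunk-level content-addressing idea, with fixed-size chunks standing in for whatever chunking scheme a real implementation would use:

```python
import hashlib
import os

CHUNK_SIZE = 4 * 1024 * 1024  # fixed 4 MiB chunks for the sketch; real systems often use content-defined chunking

def chunk_digests(data: bytes):
    """Split an uncompressed layer into chunks, each content-addressed by its SHA-256."""
    return [
        (hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest(), data[i:i + CHUNK_SIZE])
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def pull_layer(layer: bytes, local_store: dict) -> int:
    """Fetch only chunks whose digests aren't already local; return bytes 'downloaded'."""
    fetched = 0
    for digest, chunk in chunk_digests(layer):
        if digest not in local_store:
            local_store[digest] = chunk  # in reality this would be a network fetch
            fetched += len(chunk)
    return fetched

store = {}
base = os.urandom(16 * 1024 * 1024)                    # pretend this is an uncompressed 16 MiB layer
print(pull_layer(base, store), "bytes on first pull")  # all four chunks
patched = base[:-1] + bytes([base[-1] ^ 1])            # "change one file": flip one byte near the end
print(pull_layer(patched, store), "bytes on re-pull")  # only the last chunk differs
```

Hashing the uncompressed bytes is what makes this work: the same content can compress to different bytes, and a layer blob changes entirely when any file in it changes, so digests over compressed blobs can't share data across layers the way chunk digests can.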
I'm also in the process of building a BuildKit builder, and I'm seeing large improvements in the speed of exporting images. The same image that takes Docker >3 minutes to export and push takes me under a minute. https://github.com/clipper-registry/benchmarks/actions/runs/...
> ...containerization is useful across so many very different areas of computing...
So much so that containerization in general predates Linux, and even UNIX, going all the way back to System/360.
It was also introduced in Tru64, HP-UX, BSD, and Solaris before landing in Linux.
This is hell for a lot of ML containers, which have gigabytes of CUDA and PyTorch. Before, you could at least keep your code confined to its own layer. But if I understand this correctly, every code revision now duplicates gigabytes of the same damn bloated crap.
It's even worse when you end up installing PyTorch as a separate package in some other layer. It's not shared between layers at all with regular Docker.
If you have problems with 13 (I believe) GB of docker layers ... how do you deal with terabytes or petabytes of AI training data?
Training on petabytes of data is only one application of PyTorch, and that one is going to use tens of thousands of containers, but...
Inference, development cycles, any of the application domains of PyTorch that don't involve training frontier models... all of those are complicated by excessive container layers.
But mostly, dev really sucks when a small code change means writing out an extra 10GB.
Going to self-promote one last time here - I've built a fix for this, at least for the registry/image-export side, at https://clipper.dev. Docker(Hub) can't share large files between layers, but I can.
You don't even need megabytes of training data for some ML applications. AI is the sexy thing nowadays, but neural networks (Torch is a NN library) are generally useful for even small regression and classification problems.
For some problems you might even be able to get away with single-digit numbers of training points (a classic example of this regime being Physics-Informed Neural Networks).
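For a sense of how small that regime can be, here's a toy sketch (made-up data, a plain supervised fit rather than an actual PINN) of fitting a network to five points:

```python
import torch
import torch.nn as nn

# Five made-up training points for y = 2x; the whole "dataset" is a few hundred bytes.
x = torch.tensor([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 2 * x

model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(model(torch.tensor([[2.5]])).item())  # should land close to 5.0
```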
Yeah, our handful of models we just commit to the git repo--usually only a few MB.
The image still ends up being like 6-8GiB though. IIRC PyTorch had a hard dependency on the CUDA libs, which pulled in a bunch of different hardware-specific kernel binaries. The models ran on CPU and didn't even need CUDA, but it was incredibly hard to remove them--there was some PyTorch init code that expected the CUDA crap to exist even on CPU-only setups.
The training data is on a separate drive; or the training data isn't that large for this use case; or they aren't training.
You don’t train on petabytes on your laptop.
Docker already hogs a lot of disk space and needs to be pruned regularly. I can't imagine what it's going to be like now.
"really weird decision" seems like an understatement, I thought the entire point of the specific storage design with the whole layering shebang was so things could be shared? If you remove that, just get rid of layers as a whole, what's the point otherwise?
It does. It's also very nice that this moves storage usage from /var/lib/docker over to /var/lib/containerd.
Due to that, a careless installation of a few new dev systems under the new Docker version immediately blew up storage usage on the root disk, while happily ignoring hundreds of gigabytes on a volume mounted at /var/lib/docker... because that's where it needs the storage, right? A few older systems were also upgraded but didn't switch over, which was quite confusing at first.
Sorry for being salty, but that was a pretty hectic afternoon with those new agents trashing builds, and now we have a pretty annoying migration to plan for the rest. And yes, yes, it's just a reinstallation, but we have other things to do as well.
Enterprise-grade dedupe helps a lot with this; it has gotten very good.