Hear me out here ... it's like docker, but with Ai <pause for gasps and applause>.
Seems fair to raise 1bn at a valuation of 100bn. (Might roll the funds over into pitching Kubernetes, but with Ai next month)
What they really need is a Studio Ghibli'd version of their logo
They are using OCI artifacts to package models, so you can use your own registry to host these models internally. However, I just can't see any improvement compared with a simple FTP server. I don't think LLM models can adopt hierarchical structures the way Docker images do, so they can't leverage the benefits of a layered filesystem, such as caching and reuse.
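To make the FTP comparison concrete: even without Docker-style layering, an OCI registry stores the model as content-addressed blobs listed in a manifest, so a client only re-downloads blobs whose digest changed. A minimal sketch of pulling such an artifact, assuming a self-hosted unauthenticated registry and made-up repo/tag names (the registry endpoints are from the OCI Distribution Spec; everything else here is illustrative):

    import hashlib
    import requests

    REGISTRY = "http://localhost:5000"   # hypothetical internal registry
    REPO, TAG = "models/llama", "v1"

    # Fetch the artifact manifest (OCI Distribution Spec: GET /v2/<name>/manifests/<reference>).
    manifest = requests.get(
        f"{REGISTRY}/v2/{REPO}/manifests/{TAG}",
        headers={"Accept": "application/vnd.oci.image.manifest.v1+json"},
    ).json()

    # Each layer is a content-addressed blob; a client that already has a digest
    # can skip the download, which is the caching/dedup a plain FTP server lacks.
    for layer in manifest["layers"]:
        digest = layer["digest"]                        # e.g. "sha256:abcd..."
        blob = requests.get(f"{REGISTRY}/v2/{REPO}/blobs/{digest}").content
        assert "sha256:" + hashlib.sha256(blob).hexdigest() == digest  # integrity check
        with open(digest.split(":", 1)[1], "wb") as f:
            f.write(blob)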
I think ollama uses OCI too? At least it's trying to. https://github.com/ollama/ollama/issues/914#issuecomment-195...
Yes, ollama also uses OCI, but currently only works with unauthenticated registries.
It's not the only one using OCI to package models. There's a CNCF project called KitOps (https://kitops.org) that has been around for quite a bit longer. It solves some of the limitations of using Docker, one of them being that you don't have to pull the entire project when you want to work on it. Instead, you can pull just the data set, tuning, model, etc.
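For context on how a partial pull like that can work at the OCI level: each part of the project can be a separate layer in the manifest, so a client can fetch only the blobs whose media type it cares about. A sketch against the same hypothetical unauthenticated registry as above; the media types are invented for illustration and are not necessarily what KitOps uses:

    import requests

    REGISTRY, REPO, TAG = "http://localhost:5000", "models/llama", "v1"
    WANTED = {"application/vnd.example.model.weights"}   # skip dataset/tuning layers

    manifest = requests.get(
        f"{REGISTRY}/v2/{REPO}/manifests/{TAG}",
        headers={"Accept": "application/vnd.oci.image.manifest.v1+json"},
    ).json()

    for layer in manifest["layers"]:
        if layer["mediaType"] in WANTED:                 # pull only the parts we asked for
            blob = requests.get(f"{REGISTRY}/v2/{REPO}/blobs/{layer['digest']}")
            with open(layer["digest"].split(":", 1)[1], "wb") as f:
                f.write(blob.content)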
They imply it should be somehow optimized for Apple silicon, but, yeah, I don't understand what this is. If Docker can use the GPU, well, it should be able to use the GPU in any container that makes use of it properly. If (say) ollama as an app doesn't use it properly, but they figured out a way to do it better, it would make more sense to fix ollama. I have no idea why this should be a different app rather than, well, the very Docker daemon itself.
All that work (AGX acceleration...) is done in llama.cpp, not ollama. Ollama's raison d'être is being a Docker-style frontend to llama.cpp, so it makes sense that Docker would encroach from that angle.
Aren't some of the ollama guys ex-Docker guys?
yes