LMStudio? No, it's the easiest way to run an LLM locally that I've seen, to the point where I've stopped looking at other alternatives.
It's cross-platform (Win/Mac/Linux), detects the most appropriate GPU in your system, and tells you whether the model you want to download will fit within its RAM footprint.
It lets you set up a local server that you can access through API calls as if you were remotely connected to an online service.
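To give a sense of what that looks like, here's a minimal sketch against the local server's OpenAI-compatible endpoint. I'm assuming the default port of 1234 and using a placeholder model name; adjust both to whatever you've actually loaded:

```python
import requests

# LM Studio exposes an OpenAI-compatible endpoint when its local server is running.
# Port 1234 is the usual default; change it if you've configured something else.
BASE_URL = "http://localhost:1234/v1"

response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "local-model",  # placeholder; use the identifier of the model you loaded
        "messages": [
            {"role": "user", "content": "Summarize why local LLMs are useful in one sentence."}
        ],
        "temperature": 0.7,
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint mirrors the hosted APIs, most existing client code can be pointed at it just by swapping the base URL.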
FWIW, Ollama already does most of this:
- Cross-platform
- Sets up a local API server
The tradeoff is a somewhat higher learning curve, since you need to manually browse the model library and choose the model/quantization that best fits your workflow and hardware. OTOH, it's also open source, unlike LMStudio, which is proprietary.
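The API side is just as simple. Rough sketch against Ollama's local endpoint, assuming the default port of 11434 and that you've already pulled a model (e.g. with `ollama pull llama3`):

```python
import requests

# Ollama serves a local REST API on port 11434 by default.
# This assumes the model has already been pulled, e.g. `ollama pull llama3`.
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # any model you've pulled locally
        "messages": [
            {"role": "user", "content": "What quantization levels do GGUF models commonly use?"}
        ],
        "stream": False,  # return one complete JSON response instead of a token stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["message"]["content"])
```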
I assumed from the name that it only ran llama-derived models, rather than whatever is available at huggingface. Is that not the case?
No, they have quite a broad list of models: https://ollama.com/search
[edit] Oh, and apparently you can also run some models directly from HuggingFace: https://huggingface.co/docs/hub/ollama
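From skimming that page, GGUF repos can apparently be referenced as hf.co/&lt;user&gt;/&lt;repo&gt;. Rough sketch below; the repo name is just a placeholder, and `ollama run` will pull the model first if it isn't already local:

```python
import subprocess

# Per the HuggingFace docs linked above, GGUF repos can be addressed as hf.co/<user>/<repo>.
model = "hf.co/some-user/some-gguf-repo"  # placeholder name, for illustration only

# `ollama run <model> <prompt>` pulls the model if needed and prints one completion.
result = subprocess.run(
    ["ollama", "run", model, "Say hello in one short sentence."],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```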