There are many cheap, open models available on the vLLM engine: https://huggingface.co/models?other=vllm. These include gpt-oss, Llama, and Gemma, in addition to Qwen, DeepSeek, Mistral, Kimi, GLM, and Poolside.
Yes, and I keep copies of the ones I like[0]. I can't run the huge ones, and the ones I can run aren't as good as the "frontier" models. Regardless, I expect they will be considered contraband someday.
[0] - I've been using llama.cpp and Ollama. I should check out vLLM.