Can anyone give tips for getting something that runs fairly fast under Ollama? It doesn't have to be very intelligent.
When I tried gpt-oss and Qwen via Ollama on an M2 Mac, the main problem was that they were extremely slow, but I do need a free local model.
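For reference, here's roughly how I've been comparing speeds, by hitting Ollama's local /api/generate endpoint (rough sketch; the qwen3:4b tag is just an example of whatever model you've pulled):

    import requests

    # Ask the local Ollama server for a short completion and report tokens/sec.
    # Assumes Ollama is running on its default port 11434.
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "qwen3:4b", "prompt": "Say hello.", "stream": False},
    )
    data = r.json()
    # eval_count is the number of generated tokens; eval_duration is nanoseconds.
    tps = data["eval_count"] / (data["eval_duration"] / 1e9)
    print(f"{tps:.1f} tokens/sec")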
How much RAM are you running with? Qwen3 and gpt-oss:20b punch a good bit above their weight. I personally use them for small agents.
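The agent side is just a plain chat call, e.g. with the official ollama Python client (sketch; again the model tag is whatever you've pulled):

    import ollama

    # Minimal single-turn call; assumes `ollama pull qwen3:4b`
    # (or similar) has already been run.
    resp = ollama.chat(
        model="qwen3:4b",
        messages=[{"role": "user", "content": "Summarize: Ollama runs LLMs locally."}],
    )
    print(resp["message"]["content"])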
Have you tried llama.cpp? I get 250 tokens/sec on gpt-oss using a 4090; not sure about Mac speeds.
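If you'd rather drive it from Python, llama-cpp-python wraps the same engine. Something like this (sketch; the GGUF path is a placeholder, and the Metal build handles GPU offload on Apple Silicon):

    from llama_cpp import Llama

    # n_gpu_layers=-1 offloads every layer to the GPU (Metal on a Mac,
    # CUDA on a 4090). Path is a placeholder for whatever GGUF you downloaded.
    llm = Llama(model_path="./gpt-oss-20b.gguf", n_gpu_layers=-1, n_ctx=4096)
    out = llm("Say hello.", max_tokens=64)
    print(out["choices"][0]["text"])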