Hacker News

evilduck 3 days ago

With a 16GB GPU you can comfortably run models like Qwen3 14B or Mistral Small 24B at Q4 to Q6 quantization, still have plenty of room left for context, and get much better capabilities than you would from an 8B model.
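
Rough back-of-envelope for the weights alone, if anyone wants to check the fit. This is only a sketch: the parameter counts are the nominal ones from the model names, the 4 and 6 bits per weight are nominal too (real Q4/Q6 files usually carry some overhead, so treat these as lower bounds), and `weight_gb` is a made-up helper, not something from any library. KV cache and runtime overhead come out of whatever is left.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


VRAM_GB = 16  # GPU size from the comment above

# Nominal parameter counts and bit widths (assumptions, see the note above).
MODELS = [("Qwen3 14B", 14), ("Mistral Small 24B", 24), ("8B model", 8)]
QUANTS = [("Q4", 4), ("Q6", 6)]

for name, params in MODELS:
    for label, bits in QUANTS:
        size = weight_gb(params, bits)
        left = VRAM_GB - size
        print(f"{name} at {label}: ~{size:.1f} GB weights, ~{left:.1f} GB left")
```

A negative remainder means the weights alone would not fit in 16 GB under these assumptions, so the 24B model is more realistic toward the Q4 end of that range, while the 14B has headroom at either setting.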