`llama-server` looping mitigations --repeat-penalty something greater than 1.0, set reasoning/thinking OFF explicitly, prefer a gguf with more than 4bit quant
`llama-server` looping mitigations --repeat-penalty something greater than 1.0, set reasoning/thinking OFF explicitly, prefer a gguf with more than 4bit quant