Hacker News

ghxst 3 days ago [ - ]

What editor do you use, and how did you set it up? I've been thinking about trying this with some local models and also with super low-latency ones like Gemini 2.5 Flash Lite. Would love to read more about this.

mathiaspoint 3 days ago [ - ]

Neovim with the llama.cpp plugin and heavily quantized qwen2.5-coder with 500 (600?) million parameters. It's almost plug and play although the default ring context limit is way too large if you don't have a GPU.