I've been working on Peen, a CLI that lets local Ollama models call tools effectively. It's still amateur work, but I've been surprised by how much a few hours spent on prompting, plus code to handle the responses, can improve the output of small local models.
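For anyone curious what "prompting plus response handling" means in practice, here's a minimal sketch of the general approach, not Peen's actual code: the system prompt, the read_file tool, and the model name are all made up for illustration. You tell the model to emit a JSON object when it wants a tool, then fish that JSON out of the reply.

```python
import json
import re
import ollama  # pip install ollama

# Hypothetical system prompt; Peen's real prompt differs.
SYSTEM = (
    "You can call one tool: read_file(path). "
    'To call it, reply with ONLY a JSON object like '
    '{"tool": "read_file", "path": "notes.txt"}. '
    "Otherwise, answer normally."
)

def extract_tool_call(text):
    """Pull the first JSON object out of a model reply, if any."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return call if "tool" in call else None

resp = ollama.chat(
    model="llama3.2",
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What's in notes.txt?"},
    ],
)
call = extract_tool_call(resp["message"]["content"])
if call:
    print("tool call:", call)
else:
    print("plain answer:", resp["message"]["content"])
```

Most of the "few hours" goes into hardening extract_tool_call against models that wrap the JSON in prose, markdown fences, or half-valid syntax.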

https://github.com/codazoda/peen

Current LLMs use special tokens for tool calls and are thoroughly trained on them, approaching 100% correctness these days, including multiple tool calls in a single response. That's hard to beat with a custom tool-call format; even older 80B models struggle with custom tools.
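For comparison, Ollama exposes that native path through the tools field of its chat API, and a tool-trained model returns structured tool_calls instead of free text. A minimal sketch (the get_weather tool and the model name are placeholders):

```python
import json
import urllib.request

# Native tool calling via Ollama's /api/chat endpoint.
body = {
    "model": "llama3.2",
    "stream": False,
    "messages": [{"role": "user", "content": "Weather in Boise?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as r:
    reply = json.load(r)

# Tool-trained models return structured calls here;
# no parsing of free-form text is required.
for call in reply["message"].get("tool_calls", []):
    print(call["function"]["name"], call["function"]["arguments"])
```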


Very cool. Love to see more being squeezed from smaller models.