Hacker News

That is because they use a different tool calling format than most other models. Unsloth quants fix this in their Gemma releases.

feffe 9 hours ago [ - ]

I've never been able to fix the tool calling issues. Running unsloth versions with llama.cpp, constant issues. Have tried many forum fixes, including lots of fixed chat templates, to no avail. It's mostly the edit call that breaks, which often results in "let me just rewrite the whole file from context".

stevenhubertron 18 hours ago [ - ]

Can you say a bit more about this? The bad tool calling has made me give up on using Gemma for my Hermes and a personal recipe site. I have only downloaded from Ollama.

satvikpendem 17 hours ago [ - ]

Ollama is not recommended [0], use llama.cpp or more specifically Unsloth Studio which wraps llama.cpp and which has an API mode you can use to hook into Hermes or another agent. Unsloth make both the Studio and the quants which fix various issues with many models [1] as well as implementing new features like MTP and QAT support much sooner than other teams. In general you should read r/LocalLLaMa as it has a lot of updates regarding local models as the field moves fast.

[0] https://sleepingrobots.com/dreams/stop-using-ollama/

[1] https://github.com/unslothai/unsloth/discussions/4921