Custom tool calling formats are iffy in my experience. The models are all trained with reinforcement learning to follow specific formats, so it’s always a battle, and it feels to me like using the tool wrong.
Have you had good results with the other frontier models?
Not the parent commenter, but in my testing, all recent Claudes (4.5 onward) and the Gemini 3 series have been pretty much flawless with custom tool call formats.
Thanks.
I’ve tested local models from the Qwen, GLM, and Devstral families.