It's unfortunate how many of the 'non-mainstream' models are poor at function handling. I'm trying K2 out via Novia AI, and it consistently fails to format function calls, which breaks the reasoning flow.
This is most likely an issue on the inference provider's side: https://github.com/MoonshotAI/K2-Vendor-Verifier
For example, Together AI has only a 71% success rate, while the official API has a 100% success rate.
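
If you want a rough sanity check on your own provider, here is a minimal sketch of how a "success rate" like this could be computed. To be clear, this is not the verifier's actual method. It assumes the tool calls come back in the common OpenAI-style shape (a `function` object with a `name` and a JSON string in `arguments`), and `known_tools`, `is_valid_tool_call`, and `success_rate` are hypothetical names made up for illustration.

```python
import json


def is_valid_tool_call(call, known_tools):
    """Return True if the call names a known tool and its arguments
    parse as a JSON object containing every required key.

    Assumes an OpenAI-style shape: {"function": {"name": ..., "arguments": "<json string>"}}.
    `known_tools` is a hypothetical mapping of tool name -> list of required argument names.
    """
    fn = call.get("function") or {}
    name = fn.get("name")
    if name not in known_tools:
        return False
    try:
        args = json.loads(fn.get("arguments", ""))
    except (json.JSONDecodeError, TypeError):
        # Truncated or otherwise malformed JSON, or arguments missing entirely.
        return False
    if not isinstance(args, dict):
        return False
    return all(key in args for key in known_tools[name])


def success_rate(calls, known_tools):
    """Fraction of calls that pass the validity check (0.0 for an empty list)."""
    if not calls:
        return 0.0
    ok = sum(is_valid_tool_call(c, known_tools) for c in calls)
    return ok / len(calls)


# Made-up sample data, one good call and three typical failure modes.
known_tools = {"get_weather": ["city"]}
calls = [
    {"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},  # valid
    {"function": {"name": "get_weather", "arguments": '{"city": "Paris"'}},   # truncated JSON
    {"function": {"name": "get_wether", "arguments": '{"city": "Paris"}'}},   # unknown tool name
    {"function": {"name": "get_weather", "arguments": "{}"}},                 # missing required key
]

print(f"{success_rate(calls, known_tools):.0%}")
```

Running the same prompts against two providers and comparing the resulting rates should at least show whether the formatting failures follow the provider rather than the model.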