Have you given GLM 5.1 or Kimi K2.6 a shot for coding? They outperform Deepseek v4 pro.

GLM 5.1 is fantastic for me. But that could be down to how I use it: I don't ask it to build entire apps or entire features, instead asking it to build piecemeal functionality. For that it compares very well to ChatGPT 5.4 (I haven't extensively tried 5.5; it might be better, might be the same). I did give Deepseek v4 Pro a try, but not much more than a try: after it performed subpar on four tasks in a row (missing the obvious/intended path, then generating slightly buggy code to make things work the non-obvious way), I gave up on it.

GLM 5.1 was a bit of a Llama 3.1 moment for me, but for code. Llama 3.1 was the first open model I could chat with that was usable, handling my inputs the intended way; GLM 5.1 is the first open model that's actually usable for coding.

I tried Kimi K2.6 but came away underwhelmed -- it is much more expensive and slower, but doesn't feel any better to me. I haven't tried the GLM series.

> Kimi K2.6 a shot for coding? They outperform Deepseek v4 pro

I think this probably depends quite a bit on the specific problem. I'm finding that Deepseek v4 Flash often outdoes Kimi K2.6 on a variety of coding problems that involve complex spatial reasoning.

Oh, that's quite interesting, and it hasn't been my experience with regular backend code, specifically with respect to tool calling. However, that could be because the tool-calling format in vLLM for Deepseek v4 was broken until a few days ago, and that's how I'm running it.
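For anyone hitting the same thing: vLLM translates a model's raw output into structured tool calls via a model-specific parser, so a missing or buggy parser shows up exactly as broken tool calling even when plain generation works fine. A minimal sketch of the serve invocation (the model ID and parser name below are placeholders, not a confirmed fix -- check what parsers your vLLM build actually ships):

```shell
# --enable-auto-tool-choice and --tool-call-parser are real vLLM flags;
# the model ID and parser name here are hypothetical placeholders.
vllm serve some-org/some-model \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

If the parser doesn't match the model's tool-call output format, you typically get tool calls dumped as plain text instead of structured arguments.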

I've been hearing amazing things about Flash, I should give it a try.