I'm not sure I'd call it "almost on the frontier," but I do think that v4 Pro is the most usable coding model I've seen out of China. I've used it via Ollama Cloud (coding) and OpenRouter (data processing). Feels Sonnet-level to me -- solid at implementation when given a specification, but falls a good bit short of Opus 4.7 max thinking when planning out larger changes or when given open-ended prompts.

Have you given GLM 5.1 or Kimi K2.6 a shot for coding? They outperform DeepSeek v4 Pro.

GLM 5.1 is fantastic for me. But that could be down to how I use it: I don't ask it to build entire apps or entire features, instead asking it to build piecemeal functionality. For that it compares very well to ChatGPT 5.4 (I haven't extensively tried 5.5; it might be better, might be the same). I did give DeepSeek v4 Pro a try, but not much more than a try: after it performed subpar on four tasks in a row (missing the obvious/intended path, generating slightly buggy code to make things work the non-obvious way), I gave up on it.

GLM 5.1 for me was a bit of a Llama 3.1 moment (the first open model I could chat with that handled my inputs the intended way), but for code: the first open model that was actually usable.

I tried Kimi K2.6 but came away underwhelmed -- it is much more expensive and slower, but does not feel better to me. Haven't tried the GLM series.

> Kimi K2.6 a shot for coding? They outperform Deepseek v4 pro

I think this probably depends quite a bit on the specific problem. I'm finding that DeepSeek v4 Flash often outdoes Kimi K2.6 on a variety of coding problems that involve complex spatial reasoning.

Oh, that's quite interesting. It hasn't been my experience with regular backend code, specifically with respect to tool calling. However, that could be because the tool calling format in vLLM for DeepSeek v4 was broken until a few days ago, and that's how I'm running it.

I've been hearing amazing things about Flash, I should give it a try.

Keep in mind that DeepSeek has a max thinking mode of its own in the API.