DeepSeek V4 Pro seems to have significantly lower overhead than GLM 5.2 at the same context size. If the two are about equally smart, that's not a great look for GLM. E.g. GLM's KV cache at full context is significantly larger, which directly limits how many requests you can batch on memory-constrained hardware. Keep in mind that the current DeepSeek V4 Pro is a preview model, so we may see further iterations released soon. Hopefully the GLM folks will pick up these techniques for GLM 6 or something; the model itself is quite nice after all. It's just noticeably harder to run on limited local hardware.
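
To put rough numbers on the batching point, here's a back-of-the-envelope sketch. It uses the textbook KV-cache formula for plain grouped-query attention, and both configs are made-up placeholders, not the real specs of DeepSeek V4 Pro or GLM 5.2 (either model may use a compressed attention scheme that this formula doesn't capture). The function names and the 48 GiB budget are also just assumptions for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Per-sequence KV-cache size for plain (grouped-query) attention."""
    # Factor 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem


def max_batch_size(free_bytes, per_sequence_bytes):
    """How many full-context sequences fit in the memory left after the weights."""
    return free_bytes // per_sequence_bytes


GIB = 1024 ** 3

# Hypothetical configs (NOT real model specs): same depth and context,
# differing only in how many KV heads get cached. 2-byte (fp16/bf16) elements.
model_a = kv_cache_bytes(num_layers=60, num_kv_heads=8, head_dim=128, context_len=131072)
model_b = kv_cache_bytes(num_layers=60, num_kv_heads=2, head_dim=128, context_len=131072)

# Assumed memory left for the KV cache after loading weights.
free = 48 * GIB

for name, size in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: {size / GIB:.1f} GiB per sequence, "
          f"max batch {max_batch_size(free, size)}")
```

With these placeholder numbers, the model with the 4x larger cache fits a single full-context sequence in the budget while the other fits six, which is the kind of gap that matters when you're trying to batch on a small box.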