Kimi and GLM models have coined a new term: Thinkslop. They run a chain of thought that is up to 10x longer than other models', and it seems that through a lookback mechanism they can use the CoT to reason their way to solutions for tasks they couldn't otherwise solve.

The downside is of course that they consume many more tokens off your plan, and also that they are significantly slower. Kimi K2.7 takes about 7x longer to finish the same benchmark tasks as DeepSeek V4 Pro on my router benchmarks (https://role-model.dev/).
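For anyone who wants to check a ratio like that on their own runs, here is a rough sketch of the arithmetic. It is not the code behind the benchmark; `TaskRun`, `relative_cost`, and the numbers are made up for illustration, and it assumes both models ran the same task set.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    seconds: float       # wall-clock time for one task
    output_tokens: int   # tokens generated, chain of thought included


def relative_cost(candidate: list[TaskRun], baseline: list[TaskRun]) -> tuple[float, float]:
    """Return (time ratio, token ratio) of candidate vs. baseline over the same tasks."""
    if len(candidate) != len(baseline) or not baseline:
        raise ValueError("both models must be run on the same non-empty task set")
    time_ratio = sum(r.seconds for r in candidate) / sum(r.seconds for r in baseline)
    token_ratio = sum(r.output_tokens for r in candidate) / sum(r.output_tokens for r in baseline)
    return time_ratio, token_ratio


# Hypothetical numbers, chosen only so the time ratio comes out at 7x.
slow = [TaskRun(70.0, 20_000), TaskRun(140.0, 40_000)]
fast = [TaskRun(10.0, 4_000), TaskRun(20.0, 8_000)]
print(relative_cost(slow, fast))  # (7.0, 5.0)
```

Summing before dividing weights long tasks more heavily than averaging per-task ratios would; either choice is defensible, but it should be stated alongside the number.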

So for now I'm happy with just two models: GPT and DeepSeek.

> Kimi and GLM models have coined a new term: Thinkslop.
> [...]
> So for now I'm happy with just two models: GPT and DeepSeek.

1. DeepSeek V3.2, V4 Flash, V4 Pro, at high or max thinking, ...? When recommending a model, always name a precise model, not just an AI lab.

2. DeepSeek V4 Flash at max thinking is the most verbose model (among top models) in the AA benchmarks. See the "Intelligence Index Token Use" chart: [1]

[1]: https://artificialanalysis.ai/models?models=gpt-5-5-high%2Cg...

I specifically said V4 Pro. Flash is not the most verbose; that's more likely to be Kimi.

Yeah, Kimi K2.7 was doing OK but was painfully slow. The coding plan limits were good though.

I haven't tried DeepSeek yet; I should check it out.

After the release of K2.7, the Kimi plan quotas have been reduced by about 80%.
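Back-of-the-envelope, an 80% cut means roughly 5x fewer tasks per period if per-task token use stays the same. A minimal sketch, where the quota size and tokens per task are hypothetical placeholders rather than actual plan figures:

```python
def tasks_per_period(quota_tokens: int, tokens_per_task: int) -> int:
    """How many whole tasks fit into one quota period."""
    return quota_tokens // tokens_per_task


old_quota = 10_000_000              # hypothetical token quota before the change
new_quota = old_quota * 20 // 100   # 80% reduction leaves 20% of the quota
tokens_per_task = 50_000            # hypothetical, long chain of thought included

print(tasks_per_period(old_quota, tokens_per_task))  # 200
print(tasks_per_period(new_quota, tokens_per_task))  # 40
```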

Turning up the thinking lever (the maximum time spent thinking) really changes model performance, even for tiny models. But it's really irritating because it adds a lot of latency.