I absolutely love Kimi's personality - some of the things it says are so out there! And it's been great for very focused, iterative work.
Its weakness is that it seems to yak on and on when it needs to plan out something big or read through and make sense of how to use a niche piece of a complex library. To the point where it can fill up its 256k window - and rack up a bill. (No cache.) I have had a better experience with GLM 5.1 in those cases.
Anyone out there relate?
Absolutely. I use caveman to help with that: https://github.com/JuliusBrussee/caveman
You can just add "be brief" to the prompt to replace the entire plugin. Same results.
https://www.maxtaylor.me/articles/i-benchmarked-caveman-agai...
Not a bad idea - however
> Caveman only affects output tokens — thinking/reasoning tokens are untouched.
The problem is the thinking. But it could help me tune my system prompt for Kimi.