Kimi is a capable model, but it needs a very good harness. With a good harness it is very capable, but without one it can get into all kinds of issues (loops and such) that frontier models do not.
As I said, you can blame the model, but there is nothing here that the harness cannot take care of more deterministically.
Which harness do you recommend using for a model like Kimi 2.6, then: Opencode or something else?
We have our own... so I was making the comment from our own experience working with the model.
It is a lot trickier to use Kimi compared to Sonnet, which is why Sonnet seems more powerful, while I think it comes down to the harness.
How did you build your own harness? I am curious to know more about the building process, and please feel free to share your harness.
If someone were not to use your harness and rather use some stock harness, which one would you recommend? I am curious about that too.
I really don't mean to discourage anyone, but I find that making your own harness is a pretty complex process. We were very naive when we started, thinking it was just a loop, but it turned out to have all kinds of nuances, edge cases, things to be considered, etc.
As I am writing this, I am fixing a bug in the harness that could cause infinite loops under some conditions.
For simple interactions, looping over the llm complete function is not really that difficult: add some tools, write the loop, and exit.
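To illustrate, the naive "just a loop" harness described above can be sketched in a few lines. Everything here is a hypothetical stand-in (the `llm_complete` function, the message shapes, the tool registry), not any real provider API:

```python
# Minimal sketch of the naive agent loop: call the model, run any tool
# it asks for, feed the result back, and stop when it answers in prose.
# llm_complete is a fake stand-in that requests one tool call, then finishes.

def llm_complete(messages):
    # A real harness would call a chat-completion API here.
    if len(messages) == 1:
        return {"tool": "read_file", "args": {"path": "README.md"}}
    return {"content": "done"}

# Hypothetical tool registry: tool name -> callable taking an args dict.
TOOLS = {"read_file": lambda args: f"<contents of {args['path']}>"}

def run_agent(user_prompt):
    messages = [{"role": "user", "content": user_prompt}]
    while True:
        reply = llm_complete(messages)
        if "tool" in reply:  # model wants a tool call
            result = TOOLS[reply["tool"]](reply["args"])
            messages.append({"role": "tool", "content": result})
        else:  # model produced a final answer; exit the loop
            return reply["content"]

print(run_agent("summarize the repo"))  # -> done
```

This is exactly the "write the loop and exit" version; everything that follows in this thread is about what this sketch does not handle.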
It starts getting trickier when you need to:
- detect cycling behaviour;
- abort some tools, but do so gracefully;
- wait for something to complete while allowing other parts to continue;
- protect against too much usage after you hit certain thresholds;
- retry whatever fails;
- compact or truncate to maintain a good context window, and do so with the model in mind (Kimi);
- structure messages in certain ways to handle various model capabilities; etc.
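Two of the guards listed above (cycle detection and a usage threshold) can be sketched roughly like this. The class names, window sizes, and thresholds are illustrative choices, not taken from any real harness:

```python
# Sketch of two harness guards: (1) detect the model cycling by repeating
# the same tool call, (2) enforce a hard usage budget on the loop.
from collections import deque

class CycleDetector:
    """Raise if the same (tool, args) pair repeats too often in a recent window."""
    def __init__(self, window=8, max_repeats=3):
        self.recent = deque(maxlen=window)  # sliding window of recent calls
        self.max_repeats = max_repeats

    def check(self, tool, args):
        key = (tool, tuple(sorted(args.items())))  # hashable call signature
        self.recent.append(key)
        if self.recent.count(key) >= self.max_repeats:
            raise RuntimeError(f"cycle detected: {tool} repeated")

class Budget:
    """Stop the loop once a token threshold is exceeded."""
    def __init__(self, max_tokens=100_000):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens):
        self.used += tokens
        if self.used > self.max_tokens:
            raise RuntimeError("usage budget exceeded")

det = CycleDetector(max_repeats=3)
det.check("read_file", {"path": "a.py"})
det.check("read_file", {"path": "a.py"})
try:
    det.check("read_file", {"path": "a.py"})  # third repeat trips the guard
except RuntimeError as e:
    print(e)  # -> cycle detected: read_file repeated
```

A real harness would call `check` and `charge` on every loop iteration and turn the raised errors into a graceful abort rather than a crash.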
Pi.dev, Hermes, and all the others I have seen do not do even half of the things I enumerated above, at least not out of the box.
> I really don't mean to discourage anyone, but I find that making your own harness is a pretty complex process. We were very naive when we started, thinking it was just a loop, but it turned out to have all kinds of nuances, edge cases, things to be considered, etc.
I'd be interested in your harness if it's open source; please share some more resources.
> Pi.dev, Hermes, and all the others I have seen do not do even half of the things I enumerated above, at least not out of the box.
Interesting that they don't do these things out of the box. Do you still have some ideal setup, then? Like Pi.dev + X/Y/Z that can make things work close enough to ideal.
Because although our conversation is interesting, I feel like I am unable to take any actionable step from it and I might default back to Opencode. So I would love it if you could talk more about this.
I am not saying that they do not work.
What I am saying is that they do not handle many things that we handle internally. And yes, our code is not open source, so I have the luxury of comparing notes without revealing much about how we do things... sorry about this.
The reason Opencode and Pi.dev do not handle a lot of the edge cases is that they are primarily designed to run in a constrained mode with some level of human intervention assumed, and while you can certainly make them run in YOLO mode, I don't believe that is most of their usage. OpenClaw is like that, but then look at the code behind it: it is enormous. Most of it is mysterious to me.
Our tool is in my bio. But for open source, Opencode and Pi.dev are the best and most widely used.
Personally, I prefer my harness to be open source, so Opencode/Pi.dev seem the best for me.
Since your original comment mentioned Kimi being a nice model with a good harness, my guess is that Kimi with Opencode might make for a competent setup too. Currently I just use the default model on Opencode, and I have found it competent for small codebases, although you definitely have to ask it to use git/jj.
The Opencode default-model consensus seems to be GLM 4.6, and Kimi 2.6 is definitely a much more competent model.
I think tools like Opencode might keep getting better too, given their open nature, if/as the harness turns out to be the most valued piece of a model's competence, as you suggested. So I am betting on Opencode. I really appreciated this discussion; it was great to get insider insights, and I wish you good luck with your product!