Cursor had a specific licensing agreement that allowed them to brand it how they want.
> Cursor had a specific licensing agreement...
Cursor had an "agreement" with Fireworks.ai, which apparently allowed them to RL Composer 2 atop Kimi Base 2.5 without attribution: https://x.com/Kimi_Moonshot/status/2035074972943831491 / https://archive.vn/CcdkI
Composer 2 performed differently on evals than Moonshot.ai's coding models; Cursor claims theirs is better than Claude Opus 4.6: https://x.com/fynnso/status/2034706304875602030 / https://archive.vn/bVtik. And, per Lee Robinson (a Cursor employee), it is very likely that Cursor will build its own foundation model for Composer 3.
Wasn't the end of that story that Cursor had a non-disclosure licence, so they had not done anything wrong towards Moonshot?
Moonshot licensed it to Fireworks AI, who licensed it to Cursor.
Ah, is that what it is? I don't use Cursor and never saw it as being relevant to me, but it would not surprise me.
Cursor's Composer models are fine-tuned Kimi.
They are unusable (unless you want to deliberately destroy your codebase). So if Cursor's models are Kimi-based, then, well, I'll skip them altogether.
Kimi works great in their CLI, but that CLI has a number of workarounds for quirks of their models, including detecting when the model gets into a loop and reverting to a checkpoint while letting the model compose a "message" to its past self (search their CLI for "BackToTheFuture"...). It doesn't work so well in a harness that doesn't take those quirks into account.
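For anyone wondering what that kind of workaround might look like, here is a rough Python sketch of loop detection plus checkpoint revert with a note to the past self. To be clear, this is not the Kimi CLI's code: `Harness`, `repeat_limit`, `compose_note` and the "identical actions in a row" heuristic are all made up for illustration, and the real detector is presumably more involved.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Harness:
    # Hypothetical names throughout; an illustration, not the Kimi CLI's implementation.
    repeat_limit: int = 3  # identical actions in a row that count as a loop (assumed heuristic)
    history: List[str] = field(default_factory=list)
    checkpoint: List[str] = field(default_factory=list)

    def save_checkpoint(self) -> None:
        # Snapshot the conversation so it can be restored later.
        self.checkpoint = list(self.history)

    def is_looping(self) -> bool:
        # Crude detector: the last `repeat_limit` actions are all identical.
        tail = self.history[-self.repeat_limit:]
        return len(tail) == self.repeat_limit and len(set(tail)) == 1

    def step(self, action: str, compose_note: Callable[[List[str]], str]) -> bool:
        """Record one model action; return True if a rollback happened."""
        self.history.append(action)
        if not self.is_looping():
            return False
        # Let the model describe what went wrong before its context is discarded.
        note = compose_note(self.history)
        # Restore the checkpoint and hand the note to the "past self".
        self.history = list(self.checkpoint) + [f"NOTE FROM FUTURE SELF: {note}"]
        return True


if __name__ == "__main__":
    # Stand-in for asking the model to write the note.
    note_writer = lambda hist: f"'{hist[-1]}' repeated; try something else"
    harness = Harness()
    harness.step("read file", note_writer)
    harness.save_checkpoint()
    for _ in range(3):
        harness.step("run tests", note_writer)
    print(harness.history)
```

A harness without something like this just keeps feeding the repeated action back to the model, which would fit the "doesn't work so well elsewhere" experience.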
I'm using Composer extensively, and it works great for me. Your experiences are not universal.
I wouldn't skip at least testing the original. Model distilling done by Cursor could be the culprit.
They are far from unusable. They work great for 80-90% of typical full-stack dev work. A lot less useful for more niche stuff.
Composer 1.x was poor. The new one is a totally different beast and absolutely fine for day to day.
I only use Composer 2.5 day to day and it works fine with human review.
They're not unusable, they're just bad when compared with all the real frontier models.
Shaming others when all AI is trained off scraped content and code, huh? Many of those sources either break ToS or are illegal, such as Anna’s Archive. Bold move. And Chinese models in particular have been accused of distilling off American models.
Don’t you know there’s no honor among thieves?