The blog talks about the training process. Specifically, we trained with RL post-training on coding examples.

Makes sense, but what model was used for the base? Is it some open-source model, and you're not at liberty to disclose?

Not a Cursor employee, but still a researcher: it's Zhipu/Z.ai's GLM-4.6/4.5. There are traces of Chinese in the reasoning output, it's the only model it would make sense to do this RL with, and it's a model that already delivers near-SOTA performance and is open-source/open-weight.

Cursor Composer and Windsurf SWE-1.5 are both fine-tuned versions of GLM.

Interesting, thank you.

That's cool, thanks!