This aligns very closely with my experience.

When left to its own devices, GLM-4.7 frequently tries to build the world. It's also less capable of getting past stumbling blocks on its own without spiralling.

For small, well-defined tasks, it's broadly comparable to Sonnet.

Given how incredibly cheap it is, it's useful even as a secondary model.