In my experience, these models (GLM-5.1) struggle after 100K tokens.

GLM-5.1 had a coherency bug at launch, so it might be worth retrying if you haven't in a while. It can now use the full 256K context as intended.

Interesting, I'll give it another try. Thanks.