Why use a personal harness?

You have to pay API pricing, which is far more costly.

I'd either switch to GLM wholesale or just continue to use Opus within Claude Code as the blessed, subsidized path.

I would guess it is to avoid model lock-in.

My question is still this - why not just use GLM at that point?

The pricing of Opus outside of Claude Code is insane.

The tokens cost too much outside of Anthropic's blessed path.

I use GLM in my custom harness. It completes the same tasks at the same level of quality, except 8x faster and 8x cheaper. (Same goes for GPT!)

I'm not sure how that's possible. I expected to get increased correctness for that order of magnitude (something something test-time compute!) but I am not getting it.