Hey simonw, I love your test. Do you think using thinking level "max" makes sense for it? I would love to see the results.

I don't think the API supports "max" as an option; that might just be a Claude Code harness thing.

UPDATE: My mistake, the API does support max. I added a max one at the bottom of this page (cost 43 cents): https://tools.simonwillison.net/markdown-svg-renderer#url=ht...