Jeez, that arXiv paper invalidates my assumption that an LLM can't model the game. Great read. Thank you for sharing.

Insane that the model actually does seem to internalize a representation of the state of the board -- rather than just pattern-matching against similar move sequences in its training data.

...Makes me wish I could get back into a research lab. It's been a while since I've stuck with reading a whole paper out of genuine interest.

(Edit) At the same time, it's worth noting the accuracy errors and the potential for illegal moves. That's still enough to keep LLMs out of problem domains where mistakes have severe consequences, like banking, security, medicine, law, etc.