> so they can't do anything useful
I never claimed that. They demonstrate just how powerful Markov chains can be with sophisticated state representations. Obviously LLMs are useful, I have never claimed otherwise.
Additionally, it doesn’t require any logical leaps to understand decoder only LLMs as Markov Chains, they preserve the Markov Property and otherwise be have exactly like them. It’s worth noting that encoder-decoder LLMs do not preserve the Markov property and can not be considered Markov chains.
Edit: I saw that post and at the time was disappointed by how confused the author was about those topics and how they apply to the subject.