I had that in mind too. What if you handcraft a subnetwork with (some subset of) Turing machine capability? Do those kinds of circuits emerge naturally during training? Can reasoning use them for complex computation?