This is super cool! We use a similar approach for CheepCode: our agent process connects to an MCP server that then "drives" the rest of the interaction.
This paradigm feels like the obvious next step for agents. It more closely models human interaction (to the degree that this is desirable) and unlocks a lot of optimizations and powerful functionality.
It is going to be an exciting rest of the year!