The commenter "Skerit" below linked to a recent implementation of this:

https://ouro-llm.github.io/

See the left-hand side of the diagram here, which shows your exact proposal:

https://ouro-llm.github.io/static/images/ouro_main.png