> Slash commands, for instance, are a misfeature. I should never have to wait for the chatbot to finish a turn just to check on the status of my context window or how much money I've spent this session. Control should be orthogonal to the chat loop.
I get what you're trying to say, but in practice, architecting what you propose is considerably more difficult. Why not build it and try to get hired by one of the bigcos?
I don't think the basic architecture principles are novel. The big AI labs and other large tech companies already have engineers who can see this, without a doubt. But the AI labs clearly don't care if their LLM agents are just big balls of mud, and the big tech companies' priorities mostly lie elsewhere, too.
They just want features. They don't really care about duplicated work, so half of them reinvent the TUI rendering wheel. Pluggability might actually be hostile to their interest in lock-in. And the AI labs probably think "after a couple more scaling cycles, our models will be so good that our agents can just rewrite themselves from scratch"; until they hit a compute or power wall, it always looks rational to them to defer rearchitecting.
Another real possibility is that if you work on an agent with a really clean architecture and publish it in hopes of getting hired by some AI company, all of them think "that looks great, but we don't want to rearchitect right now". Your code winds up in the training set, and a year and a half from now, existing agents can "one-shot" rewrites along the lines of your design because they're "smarter".
As for me, I'm not that interested, personally. There are other things I want to build and I'm working on those.
In what way would it be more complicated? This is pretty basic concurrent programming; we routinely build much, much more complex concurrent designs.
Hell, a telegram bot can handle that just fine.
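A minimal sketch of what I mean, in Python asyncio (all names here are made up for illustration): the chat turn is just one task, and a control query is a plain read of shared session state, so it answers immediately even while a turn is in flight. No waiting on the model to check your spend.

```python
import asyncio

class Session:
    """Shared state for one chat session. The chat loop and the
    control channel both see this object; neither owns the other."""

    def __init__(self):
        self.tokens_used = 0
        self.dollars_spent = 0.0

    async def chat_turn(self, prompt: str) -> str:
        # Stand-in for a slow model call.
        await asyncio.sleep(0.2)
        self.tokens_used += len(prompt.split())
        return f"echo: {prompt}"

    def status(self) -> str:
        # Control path: a synchronous read of shared state,
        # answered at once, independent of any in-flight turn.
        return f"tokens={self.tokens_used} spent=${self.dollars_spent:.2f}"

async def main():
    s = Session()
    turn = asyncio.create_task(s.chat_turn("hello there"))
    # While the turn is still running, control responds immediately:
    print(s.status())
    print(await turn)

if __name__ == "__main__":
    asyncio.run(main())
```

The point isn't this toy code, it's the shape: once "check status" is a read of session state rather than a message in the chat loop, slash commands stop being necessary, and any frontend (TUI, Telegram, whatever) can expose the same control surface.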