Here is my vision: the future of AI is about truly understanding the real world. The world around us.

Not everything in the real world will expose an MCP server so AI can interact with it.

Eventually, AI will need to move beyond MCP and interact with the real world the same way humans do: by observing, interpreting, reasoning, and taking action in messy, imperfect environments.

MCP tries to organize our messy word to make interaction part with the world easier in the near term, and it will help accelerate very early progress. But ultimately, MCP is a temporary bridge. Not the final destination.

I give it max 3 to 6 years and it will just die.

The problem isn’t that AI cannot do this already. Because it can. The problem is cost.

There are people complaining that MCP costs too many tokens. Parsing the same UIs that humans do would just be insane.

And to be honest, I don’t think that’s the right goal of AI anyway. Most competent engineers prefer APIs to human interfaces. So why should machine-to-machine use that too?