The promise of MCP is that it “connects your models with the world”[0].
In my experience, it’s actually quite the opposite.
By giving an LLM a set of tools (30, in the Playwright case from the article), you’re essentially restricting what it can do.
In this sense, MCP is more of a guardrail/sandbox for an LLM, rather than a superpower (you must choose one of these Stripe commands!).
This is good for some cases, where you want your “agent”[1] to have exactly some subset of tools, similar to a line worker or specialist.
However, it’s not so great when you’re using the LLM as a companion/pair programmer for some task, where you want its output to be truly unbounded.
[0]https://modelcontextprotocol.io/docs/getting-started/intro
[1]For these cases you probably shouldn’t use MCP, but instead define tools explicitly within one context.
If you're running one of the popular coding agents, they can run commands in bash, which is more or less access to the infinite space of tooling I myself use to do my job.
I even use it to troubleshoot issues with my Linux laptop that in the past I would totally have handled myself, but can't be bothered to. Which led to the most relatable AI moment I have encountered: "This is frustrating" - Claude Code thought, after six tries in a row at getting my Bluetooth headset working.
Even with all of the CLI tools at its disposal (e.g. sed), it doesn’t consistently use them to make updates as it could (e.g. widespread text replacement). Once in a blue moon, an LLM will pick a tool it almost never reaches for and use it in a really smart way to handle a problem. Most of the time, though, it seems optimized for making lots of small individual edits, probably both for safety and because it makes the AI companies more money.
It's because the broader the set of "tools", the worse the model gets at utilizing them effectively. By constraining the options, you ensure a much higher percentage of correct usage.
There is a tradeoff between quantity of tools and the ability of the model to make effective use of them. If tools in an MCP are defined at a very granular level (i.e. single API calls) it's a bad MCP.
I imagine you run into something similar with bash - while bash is a single "tool" for an agent, a similar decision still needs to be made about the many CLI tools that become available by enabling bash.
I've never seen an LLM do anything but absolutely destroy Linux. So much of their training data consists of outdated solutions.
That's the best thing about Linux (et al.): it's a fairly stable target, and programs and tools are pretty much the same year on year. I wouldn't get it to help me with Nix, or let it loose on an EC2 instance, but for general troubleshooting of Arch or something it's fine.
Edge cases are everywhere, obviously, but I don't let it run wild. I approve every command it runs.
Same, this is 100x worse than just copy-pasting commands from Stack Overflow.
Given the security issues that come with MCP [1], I think it's a bad idea to call MCP a "guardrail/sandbox".
Also, there are MCP servers that allow running any command in your terminal, including apt install / brew install etc.
[1] https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
Yeah admittedly poor choice of words, given the security context surrounding MCP at large.
Maybe “fettered” is better?
Compared to giving the LLM full access to your machine (a direct shell, a Python executable as in the article), I still think it’s the right way to frame MCP.
We should view the whole LLM <> computer interface as untrusted, until proven otherwise.
MCP can theoretically provide gated access to external resources, unfortunately many of them provide direct access to your machine and/or the internet, making them ripe as an attack vector.
The security issues aren't so much with "MCP"; they are with folks giving LLMs access to do things they don't want those LLMs to be able to do. By describing MCP as guardrails, you might convince some of the nincompoops to think about where they place those guardrails.
Different issues. Let's take a look at a technology that nearly every coding agent needs to use - git, or any other version control tool. Sure, an agent can use git by running shell commands, but how do I limit what parts of git it can use? For example, IDGAF what commits it makes on a feature branch, because it will be squashed and merged later.
With an MCP server, I can expose just the commit functionality and add it to the allow list. The fact that security for remote MCP servers (i.e. not stdio) is a separate issue. The fact that there isn't an easy way to provide credentials to an MCP server is also a separate issue.
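The allow-list idea can be sketched without any MCP machinery; this is a minimal illustration (the wrapper name, the allow list, and the error handling are all hypothetical, not a real MCP server) of exposing only the git subcommands you want the agent to have:

```python
import subprocess

# Hypothetical allow list: the agent only gets the parts of git we
# don't care about it touching (e.g. commits on a feature branch).
ALLOWED_GIT_SUBCOMMANDS = {"commit", "status", "diff"}

def agent_git(subcommand: str, *args: str) -> str:
    """Run a git subcommand on the agent's behalf, but only if it is
    on the allow list; everything else is rejected up front."""
    if subcommand not in ALLOWED_GIT_SUBCOMMANDS:
        raise PermissionError(f"git {subcommand} is not exposed to the agent")
    result = subprocess.run(
        ["git", subcommand, *args], capture_output=True, text=True
    )
    return result.stdout + result.stderr
```

An MCP server exposing a `commit` tool is doing essentially this, with the allow list expressed as "which tools exist at all" rather than an explicit check.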
In my uneducated experience, MCP is nothing more than a really well-structured prompt. You can call out tools for the agent or model to use in the instruction prompt, especially for certain projects. I define workflows that trigger when certain files are changed in Cursor, and usually the model can run uninterrupted for a while.
> In my uneducated experience MCP is nothing more than a really well structured prompt.
MCP isn't a prompt (though prompts are a resource an MCP server can provide). An MCP client that is also the direct LLM manager toolchain has to map the tool/prompt/resource definitions from MCP servers into the prompt, and it usually does so using prompt templates that are defined for each model, usually by the model provider. So the “really well-structured prompt” part doesn't come from MCP at all; it's something that already exists, which the MCP client leverages.
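A rough sketch of that mapping, under stated assumptions: the tool definition below is simplified from the real MCP schema, and the rendering function stands in for a model-specific prompt template (the real template varies by provider):

```python
import json

# Simplified stand-in for a tool definition an MCP server advertises.
tool_def = {
    "name": "code_search",
    "description": "Searches for a pattern in the codebase using ripgrep.",
    "inputSchema": {
        "type": "object",
        "properties": {"pattern": {"type": "string"}},
        "required": ["pattern"],
    },
}

def render_tool_block(tools: list[dict]) -> str:
    """Hypothetical client-side step: render tool definitions into the
    model's prompt template. The wording here is made up."""
    lines = ["You may call the following tools:"]
    for t in tools:
        lines.append(f"- {t['name']}: {t['description']}")
        lines.append(f"  arguments schema: {json.dumps(t['inputSchema'])}")
    return "\n".join(lines)

prompt = render_tool_block([tool_def])
```

The well-structured text the model actually sees is produced by the client's `render_tool_block` step, not by the protocol; MCP only standardizes the definitions being rendered.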
My coding agent just has access to these functions:
ask> what all tools u have?
I have access to the following tools:
1 code_search: Searches for a pattern in the codebase using ripgrep.
2 extract_code: Extracts a portion of code from a file based on a line range.
3 file_operations: Performs various file operations like ls, tree, find, diff, date, mkdir, create_file.
4 find_all_references: Finds all references to a symbol (function, class, etc.) from the AST index.
5 get_definition: Gets the definition of a symbol (function, class, etc.) from the AST index.
6 get_library_docs: Gets documentation for a library given its unique ID.
7 rename_symbol: Renames a symbol using VS Code.
8 resolve_library_id: Resolves a library name to a unique library ID.
What do I need MCP and other agents for? This is solving most of my problems already.
> what do i need MCP and other agents for?
For your use cases, maybe you don't. Not every use case for an LLM is identical to your coding usage pattern.
Which coding agent are you using?
It's not a guardrail, it's guidance. You don't guide a child or an intern with "here is everything under the sun, just do things"; you give them a framework, programming language, or general direction to operate within.
Interns and children didn’t cost $500B.
You're right, they've cost trillions and trillions of dollars, and getting any single one up to speed takes a minimum of 18 to 25 years.
$500B sounds like a value prop in that regard.
Collectively they kind of do and then some. That cost for AI is in aggregate, so really it should be compared to the cost of living + raising children to be educated and become interns.
At some point the hope for both is that they result in a net benefit to society.
Some of them quip on HN, quite impressive.
How is that relevant?
I find it’s best to use it to actually give context. For example, when prompted with a piece of information that the LLM doesn’t know how to look up (such as a link to the status page or logs for an internal system), give it a tool to perform the lookup.
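A minimal sketch of that pattern, assuming a made-up log store and tool name (nothing here is a real API; it only illustrates handing the model a lookup it cannot do itself):

```python
# Hypothetical internal log store the model cannot browse on its own.
LOG_STORE = {
    "auth-service": "2025-01-01 12:00 ERROR token refresh failed",
}

def lookup_logs(system: str) -> str:
    """Tool handler: return recent logs for a named internal system."""
    return LOG_STORE.get(system, f"no logs found for {system!r}")

# What the agent is shown: a schema describing the lookup it can call,
# instead of being expected to guess at unreachable context.
TOOL_SPEC = {
    "name": "lookup_logs",
    "description": "Fetch recent logs for an internal system.",
    "parameters": {
        "type": "object",
        "properties": {"system": {"type": "string"}},
        "required": ["system"],
    },
}
```

The value is exactly the context injection: the model's prompt mentions `auth-service`, and the tool call turns that mention into data the model could not otherwise see.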
All of this superhuman intelligence and we still haven't solved the "CALL MOM" demo