If you're running one of the popular coding agents, they can run commands in bash which is more or less access to the infinite space of tooling I myself use to do my job.
I even use it to troubleshoot issues with my linux laptop that in the past I would totally have done myself, but can't be bothered. Which led to the most relatable AI moment I have encountered: "This is frustrating" - Claude Code thought, after 6 tries in a row to get my bluetooth headset working.
Even with all of the CLI tools at its disposal (e.g. sed), it doesn’t consistently use them to make updates as it could (e.g. widespread text replacement). Once in a blue moon, an LLM will choose some tool and use it in a way that they almost never do in a really smart way to handle a problem. Most of the time it seems optimized for using too many individual things, probably both for safety and because it makes the AI companies more money.
It's because the broader the set of "tools" the worse the model gets at utilizing them effectively. By constraining the use you ensure a much higher % of correct usage.
There is a tradeoff between quantity of tools and the ability of the model to make effective use of them. If tools in an MCP are defined at a very granular level (i.e. single API calls) it's a bad MCP.
I imagine you run into something similar with bash - while bash is a single "tool" for an agent, a similar decision still need to be made about the many CLI tools that are available from enabling bash.
I've never seen an LLM do anything but absolutely destroy linux. So much of their data is outdated solutions.
That the best thing about Linux(et al), it's a fairly stable target and programs and tools are pretty much the same as they were year on year. I wouldn't get it to help me with Nix, or let it loose on an EC2 instance, but doe general troubleshooting of Arch or something it's fine.
Edge cases are everywhere, obviously, but I don't let it run wild. I approve every command it runs.
same, this is 100x worse than just copy pasting commands from stack overflow.