Anyone from the Cursor world already YOLOs it by default.
A massive productivity boost I get is using it for server maintenance, with a prompt like:
Using gcloud compute ssh, log into all gh runners and run docker system prune, in parallel for speed, and give me a summary report of the disk usage after.
This is an undocumented and underused side of basic agentic abilities. It doesn't have to JUST write code.
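For anyone who wants the same thing as a repeatable script, here is a minimal sketch of what that prompt amounts to. The runner names and zones are hypothetical placeholders, and it assumes the SSH user can run `docker` without sudo. The `runner` parameter exists only so the logic can be tried without touching real machines.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inventory: (instance name, zone) pairs for the GitHub runners.
RUNNERS = [("gh-runner-1", "us-central1-a"), ("gh-runner-2", "us-central1-b")]

# Prune unused Docker data, then print root filesystem usage.
REMOTE_CMD = "docker system prune -f && df -h /"


def build_ssh_command(instance, zone, remote_cmd=REMOTE_CMD):
    """Return the gcloud argv that runs remote_cmd on one instance."""
    return ["gcloud", "compute", "ssh", instance,
            "--zone", zone, "--command", remote_cmd]


def run_local(argv):
    """Run argv locally and return (exit code, combined output)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def prune_all(runners, runner=run_local, max_workers=8):
    """Run the prune command on every runner in parallel; map name -> result."""
    def one(item):
        instance, zone = item
        return instance, runner(build_ssh_command(instance, zone))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(one, runners))


def summarize(results):
    """One line per runner: status plus the last output line (the df row)."""
    lines = []
    for name, (code, output) in sorted(results.items()):
        text = output.strip()
        last = text.splitlines()[-1] if text else ""
        status = "ok" if code == 0 else f"failed ({code})"
        lines.append(f"{name}: {status} {last}".rstrip())
    return "\n".join(lines)

# Usage against real machines: print(summarize(prune_all(RUNNERS)))
```

Threads are enough here because each worker just waits on an SSH session.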
Yesterday I was trying to move a backend system to a new AWS account and it wasn’t working. I asked Claude Code to figure it out. About 15 minutes and 40 aws CLI commands later, it did! Turned out the API Gateway’s VPCLink needed a security group added, because the old account’s VPC had a default egress rule and the new one’s didn’t.
I barely understand what I just said, and I’m sure it would have taken me a whole day to track this down myself.
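To make the failure mode concrete, here is a rough sketch of the kind of check involved: find security groups with no outbound rules. It assumes input shaped like the JSON from `aws ec2 describe-security-groups`; the group IDs and rules in the sample are made up for illustration and are not from the actual accounts.

```python
def groups_without_egress(describe_output):
    """Return IDs of security groups that have no outbound (egress) rules."""
    return [
        group["GroupId"]
        for group in describe_output.get("SecurityGroups", [])
        if not group.get("IpPermissionsEgress")
    ]


# Hypothetical sample: one group with an allow-all egress rule, one with none.
sample = {
    "SecurityGroups": [
        {
            "GroupId": "sg-old",
            "IpPermissionsEgress": [
                {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
            ],
        },
        {"GroupId": "sg-new", "IpPermissionsEgress": []},
    ]
}

print(groups_without_egress(sample))  # ['sg-new']
```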
Obviously I did NOT turn on auto-approve for the aws command during this process! But now I’m making a restricted role for CC to use in this situation, because I feel like I’ll certainly be doing something like this again. It’s like the AWS Q button, except it actually works.
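A restricted role for the investigation phase could start from a read-only policy, roughly like this sketch. The action list is an assumption about what this kind of debugging needs, not a vetted policy, and the `allows_only_reads` helper is a hypothetical sanity check rather than anything AWS provides.

```python
import json

# Assumed read-only actions for poking at API Gateway, VPC networking and logs.
READ_ONLY_ACTIONS = [
    "apigateway:GET",
    "ec2:Describe*",
    "logs:Describe*",
    "logs:Get*",
    "logs:FilterLogEvents",
]

# Verb prefixes treated as read-only by the sanity check below (an assumption).
READ_PREFIXES = ("GET", "Get", "Describe", "List", "Filter")


def read_only_policy(actions=READ_ONLY_ACTIONS):
    """Build an IAM policy document that allows only the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnlyDebugging",
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
            }
        ],
    }


def allows_only_reads(policy):
    """Return True if every allowed action starts with a read-style verb."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        for action in statement["Action"]:
            _, _, verb = action.partition(":")
            if not verb.startswith(READ_PREFIXES):
                return False
    return True


print(json.dumps(read_only_policy(), indent=2))
```

With a role like this the agent can diagnose freely, and the actual fix (for example adding a security group) still goes through a human or through IAC.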
> I barely understand what I just said, and I’m sure it would have taken me a whole day to track this down myself.
This is what the future of IT work looks like
But it’s scalable! (And has electrolytes)
It's got what code craves.
As long as millions of dollars keep flowing to owners and CEOs, no one will see the slightest issue with that.
Better yet would be to have it codify the config using IAC for reproducibility.
That’s what I was doing! The app config is done with IAC — and the new account is too — but that old account wasn’t.
This kind of trial-and-error debugging is the main reason I pay for Claude Code. Software development and writing code is meh. I mean, it’s okay. But I have a strong opinion on most coding tasks. But debugging something I touch once in a blue moon, trying out 10 commands before I find the failure point - that’s just something else.
Yeah, totally. Setting it loose on your home lab is quite an eye-opener. It’ll whip up recovery scripts, diagnose hung systems, and figure out root causes of driver bugs; Linux has never been so user-friendly!
Bonus points: I finally have permissions sorted out on my Samba share, haha …
(For years I was befuddled by Samba config)
100% agree
I've used Linux as my daily driver for well over a decade now, but there were quite a few times where I almost gave up.
I knew I could always fix any problem if I was willing to devote the time, but that isn't a trivial investment!
Now with these AI tools, they can diagnose, explain, and fix issues in minutes. My system is more customized than ever before, and I'm not afraid to try out new tools.
True for more than just Linux too. It's a godsend for homelab stuff.
I don't quite go this far, but I do use Claude/Codex to write Ansible playbooks/roles/collections that then do this kind of thing.
It is very easy to see what actions are being taken from the code produced, and then one gets a tool that can be used over and over again.
You can then also put these into mise tasks, because mise is great too.
There are a million different tools that are designed to do this, e.g. this task (log into a bunch of machines and execute a specific command without any additional tools running on each node) is literally the design use case for Ansible. It would be a simple playbook, why are you bringing AI into this at all?
Agreed, this is truly bizarre to me. Is OP not going to have to do this work all over again in x days' time, once the nodes fill with stale docker assets again?
AI can still be helpful here if you're new to scheduling a simple shell command, but I'd be asking the AI how do I automate the task away, not manually asking the AI to do the thing every time. Or I'd use my runners in a way that means I don't even have to concern myself with scheduled prune calls.
No, we have a team dedicated to fixing this long term, but this allowed 20 engineers to get working right away. Long term fix is now in.
If a team of 20 engineers got blocked because you/the team didn't run docker prune, you arguably have even bigger problems...
> but I'd be asking the AI how do I automate the task away
AI said “I got this” :)
Yeah that sounds like a CI/CD task or scheduled job. I would not want the AI to "rewrite" the scripts before running them. I can't really think of why I would want it to?
Because I didn't have to do anything other than write that English statement, and it worked. Saved me a lot of time.
I'm glad this worked for you, but if it were me, at most I would have asked Claude Code to write me an Ansible playbook for doing this, then run it myself. That gives me more flexibility to run this in the future, to change the commands, to try it, see that it fails, and do it again, etc.
And I honestly am a little concerned about keeping a private key for a major cloud account where Claude can use it, just because I'm more than a little paranoid about certs.
You're right to be concerned. OP's method is how you get pwned through prompt injection.
Relevant: https://steipete.me/posts/2025/claude-code-is-my-computer
Is this what Ansible does? Or some other classic ops tool?
Does Cursor have a good sandboxing story?
There's no sandbox of any kind, at least on Linux, and the permission system is self-defeating. The agent will ask to run something like `bash -c "npm test"` and ask you to whitelist "bash" for future use. I don't use it daily because I don't find it useful to begin with, but when I take it for a spin it's always inside a full VM.
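To spell out why whitelisting "bash" defeats the purpose, here is a toy model of a first-word allowlist. It is a hypothetical illustration of the failure mode, not Cursor's actual permission check.

```python
import shlex


def is_allowed(command, allowlist):
    """Naive check: approve when the first word of the command is allowlisted."""
    argv = shlex.split(command)
    return bool(argv) and argv[0] in allowlist


def is_allowed_exact(command, allowed_commands):
    """Stricter check: approve only commands that match an approved string."""
    return command.strip() in allowed_commands


allowlist = {"bash"}
print(is_allowed('bash -c "npm test"', allowlist))  # True
print(is_allowed('bash -c "rm -rf ~"', allowlist))  # True: anything goes via bash

approved = {'bash -c "npm test"'}
print(is_allowed_exact('bash -c "npm test"', approved))  # True
print(is_allowed_exact('bash -c "rm -rf ~"', approved))  # False
```

Once the shell itself is approved, every later command can be wrapped in it, so the allowlist no longer constrains anything.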
I run multiple instances of the Cursor CLI in YOLO mode in a 4 x 3 tmux grid, each in an isolated Docker container. That is a pretty effective setup.