Good catch, that's a legit bypass
nah strips env var prefixes before classifying the command but doesn't inspect their values for embedded shell execution, I'll fix it: https://github.com/manuelschipper/nah/issues/6
On the broader write-then-execute point - two improvements are coming:
- Script execution inspection: when nah sees python script.py, read the file and run content inspection and LLM analysis before execution
- LLM inspection for Write/Edit: for content that's suspicious but doesn't match any deterministic pattern, route it to the LLM for a second opinion
Won't close it 100% - to your point a sandbox is the answer to that.
I don't think "security tool" and "not a sandbox" are contradictory though. Firewalls don't replace OS permissions, OS permissions don't replace encryption
nah is just another layer that catches the 95% that's structurally classifiable. It's a different threat model. If 200 IQ Opus is rogue deterministic tools or even adversarial one shot LLMs won't be able to do much to stop it...
> Firewalls don't replace OS permissions, OS permissions don't replace encryption
Of course but the crucial difference is that these operate using an allow list, not a block list.
If I extend the analogy, if my OS required me to block-list every user who shouldn't have access to my files then I wouldn't trust that mechanism to provide a security barrier. If my firewall worked in such a manner that it allowed all traffic by default and I had to manually block every attacker on the public internet then I wouldn't rely on it either.
My own analogy is that this it a bit like saying that you want a relatively safe car and then buying one without any airbags or seatbelts, and thinking it's fine because it has lane departure warnings and automatic braking. I've got nothing against you personally, I just find this sort of viewpoint extremely puzzling (and oddly common). I make the same criticism when people just disable post-install scripts instead of using a sandbox.
allowlists are stronger than blocklists - that's not debatable and right there with you
but nah isn't a pure blocklist - anything that doesn't match a known pattern classifies as unknown which defaults to ask (user gets prompted). It's not "allow all traffic, block each attacker" it's allow known-safe, block known-dangerous, prompt for everything else.
the analogy doesn't carry that far... it's a different threat model: nah isn't containing rogue agents or adversarial actors, it's a guardrail for a trusted but mistake-prone agent.
maybe more akin to a junior employee accidentally dropping the database cause they didn't know better. but how are they supposed to work on prod? They ask "boss, can I run this? SELECT customer, sales FROM SALES.PROD..." You say: cool, You don't have to ask me again for SELECT (nah allow db_read).
But then they can ask- "can I run this? drop SALES.PROD?".... hmmm, nah.