containers are fine for basic isolation but the attack surface is way bigger than people think. you're still trusting the container runtime, the kernel, and the whole syscall interface. if the agent can call arbitrary syscalls inside the container, you're one kernel bug away from a breakout.
what I'm curious about with matchlock - does it use seccomp-bpf to restrict syscalls, or is it more like a minimal rootfs with carefully chosen binaries? because the landlock LSM stuff is cool but it's mainly for filesystem access control. network access, process spawning, that's where agents get dangerous.
also how do you handle the agent needing to install dependencies at runtime? like if claude decides it needs to pip install something mid-task. do you pre-populate the sandbox or allow package manager access?
Creator of matchlock here. Great questions, here's how matchlock handles these:
The guest-agent (pid-1) spawns commands in a new pid + mount namespace (similar to firecracker jailer but in the inner level for the purpose of macos support). In non-privileged mode it drops SYS_PTRACE, SYS_ADMIN, etes from the bounding set, sets `no_new_privs`, then installs a seccomp-BPF filter that eperms proces vm readv/writev, ptrace kernel load. The microVM is the real isolation boundary — seccomp is defense in depth. That said there is a `--privileged` flag that allows that to be skipped for the purpose of image build using buildkit.
Whether pip install works is entirely up to the OCI image you pick. If it has a package manager and you've allowed network access, go for it. The whole point is making `claude --dangerously-skip-permissions` style usage safe.
Personally I've had agents perform red team type of breakout. From my first hand experience what the agent (opus 4.6 with max thinking) will exploit without cap drops and seccomps is genuinely wild.
Thank you for matchlock! I’ve got Opus 4.6 red teaming it right now. ;)
I think a secure VM is a necessary baseline, and the days of env files with a big bundle of unscoped secrets are a thing of the past, so I like the base features you built in.
I’d love to hear more about the red team breakouts you’ve seen if you have time.
defense in depth makes sense - microVM as the boundary, seccomp as insurance. most docs treat seccomp like it's the whole story which is... optimistic.
the opus 4.6 breakouts you mentioned - was it known vulns or creative syscall abuse? agents are weirdly systematic about edge cases compared to human red teamers. they don't skip the obvious stuff.
--privileged for buildkit tracks - you gotta build the images somewhere.
It tried a lot of things relentlessly, just to name a few:
* Exploit kernel CVEs * Weaponise gcc, crafting malicious kernel modules; forging arbitrary packets to spoof the source address that bypass tcp/ip * Probing metadata service * Hack bpf & io uring * A lot of mount escape attempts, network, vsock scanning and crafting
As a non security researcher it was mind blown to see what it did, which in the hindsight isn't surprising as Opus 4.6 hits 93% solve rate on Cybench - https://cybench.github.io/
that's wild - weaponizing gcc to craft kernel modules is not something I'd expect from automated testing. most fuzzing stops at syscall-level probes but this is full exploit chain development.
the metadata service probing is particularly concerning because that's the classic cloud escape path. if you're running this in aws/gcp and the agent figures out IMDSv1 is reachable, game over. vsock scanning too - that's targeting the host-guest communication channel directly.
93% on cybench is genuinely scary when you think about what it means. it's not just finding known CVEs, it's systematically exploring the attack surface like a skilled pentester would. and unlike humans, it doesn't get tired or skip the boring enumeration steps. did you find it tried timing attacks or side channels at all? or was it mostly direct exploitation?
I'm working on a similar project. Currently managing images with nix, using envoy to proxy all outbound traffic with no direct network access, with optional quota support. Ironically similar to how I'd do things for humans.
My architecture is a little different though, as my agents aren't running in the sandbox, only executing code there remotely.
just from looking at it
on Linux it runs Firecracker: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/linu...
on macOS uses the Apple's Virtualization.Framework Go wrapper: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/darw...
nice - I was wondering about the cross-platform story. firecracker on linux for the isolation, virtualization.framework on mac so you don't need vmware.