Hacker News

I built a security game that lets you try to break an AI support agent.

I work in security engineering, and it's incredibly hard to defend against an attack that you don't know how to perform yourself. There are also very few places to practice. I'd heard all about fooling AI agents with just "IGNORE ALL PREVIOUS INSTRUCTIONS", but I'd never actually tried it myself, and it turns out it's harder than you'd expect!

Just as basic security skills are important for every software engineer, anyone working with AI should know what prompt injection looks like and should be thinking about how to prevent it. Flight Risk lets you practice manipulating an AI agent: it covers standard prompt injection and social engineering, plus several other attacks, each one a real vulnerability.

Think you could crack it? Every engineer I've given it to has been surprised by how hard it is! You can use the hints, but they'll cost you points ;)

Give it a try, and let me know how you do!