I think "jailbreaking" fable to match opus 4.8 capabilities is not noteworthy. In my experience, fable is not as eager to find vulnerabilities as their mythos research describes.
I wonder if context size matters: find and fix “bugs” in the Linux kernel versus find and fix “bugs” in this short snippet of code. I would try a file-by-file approach first.
I don’t know how much we should believe the “reports”, but there are probably a few other tricks they didn’t expose. If these are pre-/post-processing guardrails, I could see something like “fix bugs” actually working.