If Claude Mythos and Fable 5 are the same underlying model, just with different safeguards, I fail to see how TerminalBench has them at different scores.
Refusals, presumably.