The title of the post on their blog is really misleading: "We have Mythos at Home: GLM 5.2 beats Claude in our Cyber Benchmarks". Mythos (or Fable) isn't even benchmarked, and there's a giant caveat literally at the bottom: "We have a caveat: This is one task, one dataset, one run."
I think the post is still informative, but a little disingenuous and clickbaity.