Would love to see the benchmark comparison between Mythos / Fable and GPT-5.5-Cyber
Do you mean full benchmarks? The article claims 85.6 for GPT-5.5-Cyber vs. 83.8 for Mythos on CyberGym.