yeah I was surprised in they benchmarks during livestream they didn't compare to ChatGPT-4o (2025-03-26) but only older one.