Marginal improvements? Have you been living under a rock for the past year?

Even o1 was a major, groundbreaking upgrade over 4o. RLVR with CoT reasoning opened up an entirely new dimension of performance scaling. And o1 is, in turn, already obsolete: superseded first by o3, and then by GPT-5.