Hacker News

I think you meant to link to

Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR https://arxiv.org/abs/2509.02522

not

Winning Gold at IMO 2025 with a Model-Agnostic Verification-and-Refinement Pipeline https://arxiv.org/abs/2507.15855

We've changed the top link to that from https://arxiv.org/abs/2507.15855. Thanks!

That paper is really cool too, though. I'm happy that your comment sort of records the old link, because I only saw the right paper.

Ack, thank you.