There is a footnote that should help with the models. Training is harder to report on, but roughly, our finding here is that RL scales.