With reinforcement learning, inference is now a major part of post-training too.