We trained a model on a large set of agent traces to select which LLM to call at each turn
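
To make the idea of per-turn routing concrete, here is a toy sketch. It is not the model described above: the source gives no details on features, architecture, or training objective. This stand-in simply estimates a success rate per (turn kind, model) pair from traces and picks the cheapest model that clears a threshold. The trace fields, the model names `small` and `large`, the costs, and the 0.8 threshold are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical trace records: (turn_kind, model_name, succeeded).
# Field names, model names, and costs are assumptions for illustration only.
TRACES = [
    ("edit_file", "small", True),
    ("edit_file", "small", True),
    ("edit_file", "large", True),
    ("plan", "small", False),
    ("plan", "small", False),
    ("plan", "large", True),
    ("plan", "large", True),
]
COST = {"small": 1.0, "large": 10.0}  # assumed relative cost per call


def fit_router(traces):
    """Estimate a success rate for each (turn_kind, model) pair seen in traces."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for kind, model, ok in traces:
        counts[(kind, model)][0] += int(ok)
        counts[(kind, model)][1] += 1
    return {key: s / n for key, (s, n) in counts.items()}


def select_model(rates, kind, min_success=0.8, default="large"):
    """Pick the cheapest model whose estimated success rate clears the bar.

    Falls back to `default` when no model qualifies or the turn kind is unseen.
    """
    ok = [m for (k, m), r in rates.items() if k == kind and r >= min_success]
    return min(ok, key=COST.__getitem__) if ok else default


rates = fit_router(TRACES)
print(select_model(rates, "edit_file"))
print(select_model(rates, "plan"))
```

A learned router would presumably replace the lookup table with a classifier over richer turn features, but the interface stays the same: given the current turn, return the name of the model to call.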