Specifically including routing (i.e. which model you route to based on load/ToD)?
PS - I appreciate you coming here and commenting!
There is no routing with the API, or when you choose a specific model in ChatGPT.
In the past, there seemed to be routing based on context length: the model was always the same, but optimized for different lengths. Is this still the case?