Specifically including routing (i.e. which model you route to based on load/ToD)?
PS - I appreciate you coming here and commenting!
There is no routing with the API, or when you choose a specific model in ChatGPT.
In the past, there seemed to be routing based on context length: the model was always the same, but optimized for different lengths. Is this still the case?