Not only that, but it seems to me that after a week the intelligence is being downscaled or routed, maybe because of a lack of capacity.

You can check https://marginlab.ai/trackers/codex/

It’s pretty good at catching when performance is degraded. For instance, performance was degraded for a week or so before Fable launched, probably due to A/B testing or capacity, as you noted.