they dont change the model weights (no frontier lab does). if you have evals and all prompts, tool calls the same, I'm curious how you are saying performance decreased..