It's not just the weights. It's also the system prompt, the harness, the safety filters, and so on. Those can significantly change how the same underlying model performs.
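To make that concrete, here is a toy sketch, not any real product's setup. `toy_model`, `Deployment`, `refuse_digits`, and `exact_match_score` are all hypothetical names made up for illustration, and `toy_model` is a hard-coded stand-in for fixed weights. The "model" never changes; only the wrapper around it does, and the score on the same test case moves with it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for fixed weights: same prompt in, same text out."""
    # Assumed behaviour for the sketch: terse only when the prompt asks for it.
    if "be concise" in prompt.lower():
        return "4"
    return "The answer to your question is 4."


@dataclass
class Deployment:
    """Everything wrapped around the weights: system prompt plus optional filter."""
    system_prompt: str
    output_filter: Optional[Callable[[str], str]] = None

    def run(self, user_input: str) -> str:
        # The harness decides how the final prompt is assembled.
        prompt = f"{self.system_prompt}\n{user_input}"
        out = toy_model(prompt)
        if self.output_filter is not None:
            out = self.output_filter(out)
        return out


def refuse_digits(text: str) -> str:
    """Deliberately over-eager toy filter: blocks any output containing a digit."""
    return "[blocked]" if any(c.isdigit() for c in text) else text


def exact_match_score(deployment: Deployment, cases) -> float:
    """Fraction of cases where the output equals the expected string exactly."""
    hits = sum(deployment.run(q) == expected for q, expected in cases)
    return hits / len(cases)


cases = [("What is 2 + 2?", "4")]
deployments = {
    "plain": Deployment("You are a helpful assistant."),
    "concise": Deployment("Be concise."),
    "filtered": Deployment("Be concise.", output_filter=refuse_digits),
}
scores = {name: exact_match_score(d, cases) for name, d in deployments.items()}
print(scores)
```

All three deployments call the same `toy_model`, yet only the "concise" one passes the exact-match check: the "plain" one answers in a full sentence, and the "filtered" one has its correct answer blocked after the fact.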