That's not how the arena works. The evaluation is blind, so Google's advertising and integrations have no effect on the results.

3 points, sure

Right, it scores only 3 points higher on image edit, which is within the margin of error. But on image generation, it scores a significant 29 points higher.
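
For a rough sense of what those gaps mean head-to-head, here is a minimal sketch. It assumes the arena scores sit on a standard Elo-style scale (logistic curve, 400-point scale factor), which is my assumption and not something stated above. The function name `expected_win_rate` is just illustrative.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated model.

    Assumes the standard Elo logistic curve with a 400-point scale;
    the arena's actual scoring model may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# The two gaps mentioned above: image edit (3) and image generation (29).
for gap in (3, 29):
    print(f"{gap:>2}-point gap: {expected_win_rate(gap):.1%} expected win rate")
```

Under that assumption, a 3-point gap is about a 50.4% expected win rate, basically a coin flip, while a 29-point gap is about 54.2%. Whether either gap is statistically meaningful still depends on the confidence intervals the leaderboard reports.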