This seems huge for subscription customers. Looking at the Artificial Analysis numbers, 5.5 at medium effort yields roughly the same intelligence as 5.4 (xhigh) while using less than a fifth of the tokens.

As long as a token counts roughly the same towards subscription plan usage on 5.5 as on 5.4, you can look at this as effectively a 5x increase in usage limits.
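Back-of-envelope version of that arithmetic, as a rough sketch. The function and parameter names are made up for illustration, the 0.2 is just the "a fifth" figure above taken as an upper bound, and `weight_ratio` is an assumption: nothing here says how providers actually weight tokens from each model against plan limits.

```python
def effective_usage_multiplier(token_ratio: float, weight_ratio: float = 1.0) -> float:
    """Estimate how much further a fixed plan quota stretches on the newer model.

    token_ratio:  tokens the newer model spends per task, divided by tokens
                  the older model spends on the same task (0.2 = a fifth).
    weight_ratio: how heavily one newer-model token counts against the quota
                  relative to one older-model token (1.0 = counted equally;
                  this is an assumption, not a published number).
    """
    if token_ratio <= 0 or weight_ratio <= 0:
        raise ValueError("ratios must be positive")
    # Quota consumed per task scales with tokens used times per-token weight,
    # so tasks per quota scale with the inverse of that product.
    return 1.0 / (token_ratio * weight_ratio)


if __name__ == "__main__":
    # Tokens counted equally: a fifth of the tokens means about 5x the tasks.
    print(effective_usage_multiplier(0.2))
    # If newer-model tokens were counted double, the gain would be halved.
    print(effective_usage_multiplier(0.2, weight_ratio=2.0))
```

Since the comparison is "less than a fifth", 5x is a floor under the equal-weighting assumption; any heavier weighting of 5.5 tokens eats into it proportionally.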

As someone who always leaves intelligence at the default and is OK with existing models, should I be shifting gears more manually as providers sell us newer models? Is a newer model at medium or lower effort better than free/cheaper models?

SOTA models on medium are probably still better than free or cheap models, but you should experiment.