Since previous generations of models get aggressively retired the cost reduction essentially never gets passed down to the customer.

A certain amount of input and output tokens doesn't cost 10x less than before.