Intelligence per token doesn't seem quite right to me.

Intelligence per <consumable> feels closer. Per dollar, or per second, or per watt.

It is possible to think of tokens as some proxy for thinking space. At least reasoning tokens work like this.

Dollar/watt are not public and time has confounders like hardware.