I'm generally with you on all of these ideas.
However, Google probably won't catch up. Nvidia has been winning even though their hardware is general purpose rather than tuned for inference.
Rubin has architectural differences I don't understand that are supposed to make inference much cheaper and faster while still retaining the general-purpose capabilities. I expect the generation after that to do even better at being both fast for inference and general purpose.
Google is betting that the depreciation on its own TPUs will cost it less than the markup it would otherwise have to pay Nvidia. I don't think that bet will pay off.