If those kinds of optimizations are on the table, why would they not already be compiling and optimizing from source?

I'm not a hyperscaler, I run a thousand machines. If I can get it just by changing the base image I use to build those machines, in an already automated process, then the optimization is basically free. Well, unless it triggers some new bug that was not there before.