Well I don’t think so.

First, op is talking about Chrome which is not an Apple software. And I can testify that I observed the same behavior with other software which are really not optimized for macOS or even at all. Jetbrains IDEs are fast on M*.

Also, processor manufacturers are contributors of the Linux kernel and have economical interest in having Linux behave as fast as they can on their platforms if they want to sell them to datacenters.

I think it’s something else. Probably unified the memory ?

Chrome uses tons of APIs from MacOS, and all that code is very well optimized by Apple.

I remember disassembling Apple’s memcpy function on ARM64 and being amazed at how much customization they did just for that little function to be as efficient as possible for each length of a (small) memory buffer.

memcpy (and the other string routines) are some of the library functions that most benefit from heavy optimisation and tuning for specific CPUs -- they get hit a lot, and careful adjustment of the code can get major performance wins by ensuring that the full memory bandwidth of the CPU is being used (which may involve using specific load instructions, deciding whether using the simd registers is better or not, and so on). So everybody who cares about performance optimises these routines pretty carefully, regardless of toolchain/OS. For instance the glibc versions are here:

https://github.com/bminor/glibc/tree/master/sysdeps/aarch64/...

and there are five versions specialised for either specific CPU models or for available architecture features.