Chrome uses tons of APIs from MacOS, and all that code is very well optimized by Apple.
I remember disassembling Apple’s memcpy function on ARM64 and being amazed at how much customization they did just for that little function to be as efficient as possible for each length of a (small) memory buffer.
memcpy (and the other string routines) are some of the library functions that most benefit from heavy optimisation and tuning for specific CPUs -- they get hit a lot, and careful adjustment of the code can get major performance wins by ensuring that the full memory bandwidth of the CPU is being used (which may involve using specific load instructions, deciding whether using the simd registers is better or not, and so on). So everybody who cares about performance optimises these routines pretty carefully, regardless of toolchain/OS. For instance the glibc versions are here:
https://github.com/bminor/glibc/tree/master/sysdeps/aarch64/...
and there are five versions specialised for either specific CPU models or for available architecture features.