Hundreds of cores is likely two sockets and so you've got NUMA there.

Scaling to large core counts has a lot of gotchas.