Surprising to me that two memory channels are separated by as little as 256 bytes. The short distance makes it easier to find, surely?

It's probably an access optimization: interleaving at a finer granularity than linearly mapping DIMMs and channels. The x86 cache line size is 64 bytes, so the interleave granularity must be a multiple of that. Probably 64*2^n bytes (256 = 64*2^2 would fit).
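
To make the idea concrete, here is a rough sketch of what such interleaving could look like. It assumes the simplest possible mapping, where consecutive 256-byte blocks just rotate across two channels. The names `channel_of` and `line_straddles_channels` are made up for illustration, and real memory controllers may well hash several address bits together instead, so treat this as a toy model rather than how any particular CPU does it.

```python
CACHE_LINE = 64  # bytes; x86 cache line size


def channel_of(addr, granularity=256, channels=2):
    """Toy mapping (assumption): consecutive granularity-sized blocks
    rotate across the channels in round-robin order."""
    return (addr // granularity) % channels


def line_straddles_channels(line_addr, granularity=256, channels=2):
    """True if the 64-byte line starting at line_addr would be split
    across two channels under the toy mapping above."""
    first = channel_of(line_addr, granularity, channels)
    last = channel_of(line_addr + CACHE_LINE - 1, granularity, channels)
    return first != last


if __name__ == "__main__":
    # Show which channel each 128-byte step lands on for the first 1 KiB.
    for addr in range(0, 1024, 128):
        print(hex(addr), "-> channel", channel_of(addr))
```

With a granularity of 64*2^n bytes, no cache-line-aligned access is ever split between channels in this model, whereas a granularity that is not a multiple of 64 (say 96) would split some lines. That is the reason to expect a multiple of the cache line size.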