If you're able to do it at the memory controller level, would it be as simple as making two controllers always operate in lockstep, so their refresh cycles are guaranteed to be offset by 50% from one another?
Given that the controller can already defer refresh cycles, and the logic that decides when to defer sounds fairly complex, I suspect that logic might already be implemented in CPU microcode.
...which raises the tantalizing possibility that this lockstep-mirrored behavior might also be doable in microcode.
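To make the 50% offset idea concrete, here is a rough Python sketch of two mirrored controllers on a fixed refresh schedule. Everything in it is an assumption for illustration: the function names are made up, the timing values (a 7800 ns refresh interval and a 350 ns refresh duration) are just ballpark DDR4-style numbers, and it ignores refresh deferral entirely. It only checks how long both controllers would be busy refreshing at the same time.

```python
# Illustrative timing values (assumed, not taken from any specific part).
TREFI_NS = 7800   # interval between refresh commands
TRFC_NS = 350     # how long one refresh keeps the controller busy


def refreshing(t_ns, offset_ns, trefi_ns=TREFI_NS, trfc_ns=TRFC_NS):
    """True if a controller whose refresh windows start at
    offset_ns + k * trefi_ns is mid-refresh at time t_ns."""
    return (t_ns - offset_ns) % trefi_ns < trfc_ns


def both_busy_ns(offset_b_ns, trefi_ns=TREFI_NS, trfc_ns=TRFC_NS, step_ns=10):
    """Time within one refresh interval during which BOTH controllers
    are refreshing. Controller A starts its window at t = 0."""
    return sum(
        step_ns
        for t in range(0, trefi_ns, step_ns)
        if refreshing(t, 0, trefi_ns, trfc_ns)
        and refreshing(t, offset_b_ns, trefi_ns, trfc_ns)
    )


# Same phase: every refresh stalls both copies at once.
aligned = both_busy_ns(offset_b_ns=0)

# Half an interval apart: one copy should always be available.
staggered = both_busy_ns(offset_b_ns=TREFI_NS // 2)

print("both busy, aligned:  ", aligned, "ns per interval")
print("both busy, staggered:", staggered, "ns per interval")
```

The overlap only disappears while the refresh duration stays under half the refresh interval, and only if the offset never drifts. That second condition is where deferral gets awkward: if either controller postpones a refresh on its own, the pair would need to stay coordinated to keep the windows apart.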