> a single core in a desktop CPU can easily saturate the bandwidth of the system RAM controller.
Modern x86 machines offer far more memory bandwidth than a single core can consume. The architecture is deliberately designed that way.
Interestingly, this has not always been the case: the transition happened during the 2010s.
Some modern non-x86 machines (and maybe even some very recent x86 ones) can't saturate their system memory bandwidth even with all of their CPU cores running at full tilt. They need to combine CPU and non-CPU access to reach peak throughput.
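If you want to check this on your own machine, here is a rough sketch in Go: time large buffer copies with 1, 2, 4, ... goroutines and see whether the aggregate number keeps climbing past one worker. The function name `copyBandwidth`, the 64 MiB buffer size, and the round count are my own illustrative choices, not anything standard. The result only counts bytes read plus bytes written by the program, so treat it as an approximation rather than a true measurement of DRAM traffic.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// copyBandwidth (hypothetical helper) runs `workers` goroutines, each copying
// its own private buffer of bufBytes bytes `rounds` times, and returns the
// aggregate throughput in GB/s, counting bytes read plus bytes written.
func copyBandwidth(workers, bufBytes, rounds int) float64 {
	srcs := make([][]byte, workers)
	dsts := make([][]byte, workers)
	for i := range srcs {
		srcs[i] = make([]byte, bufBytes)
		dsts[i] = make([]byte, bufBytes)
		// Touch every page up front so page faults are not part of the timing.
		for j := range srcs[i] {
			srcs[i][j] = byte(j)
		}
		copy(dsts[i], srcs[i])
	}

	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(src, dst []byte) {
			defer wg.Done()
			for r := 0; r < rounds; r++ {
				copy(dst, src)
			}
		}(srcs[w], dsts[w])
	}
	wg.Wait()
	elapsed := time.Since(start).Seconds()

	total := float64(workers) * float64(bufBytes) * float64(rounds) * 2
	return total / elapsed / 1e9
}

func main() {
	// Assumption: 64 MiB per buffer is larger than the last-level cache.
	// Raise it on parts with very large caches.
	const bufBytes = 64 << 20
	const rounds = 4
	for w := 1; w <= runtime.NumCPU(); w *= 2 {
		fmt.Printf("%2d worker(s): %.1f GB/s\n", w, copyBandwidth(w, bufBytes, rounds))
	}
}
```

If the single-worker figure is already close to the multi-worker plateau, one core is saturating the memory controller; if the total keeps rising with more workers, it is not. Goroutines are not pinned to cores, so expect some noise between runs.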