> It's faster if you use the CPU
But not on AMD? E.g. the 8 Zen 5 cores in a CCD have only 64 GB/s of read and 32 GB/s of write bandwidth, while the dual-channel memory controller in the IOD has up to 87 GB/s of bandwidth.
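To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures quoted above. The copy model is my assumption, not a measurement: every copied byte crosses the CCD link once as a read and once as a write, and a copy done at the memory controller splits the DRAM bandwidth evenly between reads and writes. The function names are made up for illustration.

```python
# Bandwidth figures quoted in the comment above, in GB/s.
CCD_READ_GBPS = 64.0
CCD_WRITE_GBPS = 32.0
IOD_DRAM_GBPS = 87.0


def cpu_copy_ceiling(read_gbps, write_gbps):
    # Assumption: each copied byte is read once and written once over the
    # CCD link, so the slower direction bounds the copy rate.
    return min(read_gbps, write_gbps)


def dram_copy_ceiling(total_gbps):
    # Assumption: reads and writes share the DRAM bandwidth evenly.
    return total_gbps / 2


cpu = cpu_copy_ceiling(CCD_READ_GBPS, CCD_WRITE_GBPS)
dram = dram_copy_ceiling(IOD_DRAM_GBPS)
print(f"copy ceiling through one CCD:        {cpu:.1f} GB/s")
print(f"copy ceiling at the memory controller: {dram:.1f} GB/s")
```

Under those assumptions the write side of the CCD link is the bottleneck for a single-CCD copy, which is the gap a copy engine sitting next to the memory controller could in principle close.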
The issue is that setting up a DMA transfer:
A: requires the DMA engine to know about each user process's memory mappings (i.e. hardware support for understanding CPU page tables)
B: costs time going from user mode to kernel mode and back (we invented io_uring and other mechanisms precisely to avoid that).
To some extent I guess the IOMMUs available to modern graphics cards partially solve this, but I'm not sure it's a free lunch (i.e. managing the mappings might still partly fall to the driver/OS).
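The cost in point B can be framed as a break-even size: a fixed setup cost only pays off once the copy is large enough. A minimal sketch, assuming a simple linear model (time = setup + size / bandwidth); the function name and every number passed to it are placeholders, not measurements of any real system.

```python
def break_even_bytes(setup_seconds, cpu_gbps, dma_gbps):
    """Copy size at which an offloaded copy ties a CPU copy.

    Returns None when the offload engine is not faster than the CPU,
    because the fixed setup cost can then never be recovered.
    """
    if dma_gbps <= cpu_gbps:
        return None
    cpu_bps = cpu_gbps * 1e9
    dma_bps = dma_gbps * 1e9
    # Solve size / cpu_bps == setup_seconds + size / dma_bps for size.
    return setup_seconds / (1.0 / cpu_bps - 1.0 / dma_bps)


# Hypothetical inputs: 2 microseconds of setup (syscall, mapping, descriptor),
# 32 GB/s for the CPU path and 43.5 GB/s for the offloaded path.
size = break_even_bytes(2e-6, 32.0, 43.5)
print(f"offload wins above roughly {size / 1024:.0f} KiB")
```

Anything smaller than the break-even size is cheaper to copy on the CPU, which is why the setup overhead matters so much for small transfers.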