This might be orthogonal to the TLB miss overhead you found, but have you looked at using P2PDMA to transfer directly from the NVMe SSDs to the NIC? Not sure how the CRC calculation would play into that.