Is that true? Aren’t lots of the TCP optimisations about offloading work to the hardware, e.g. segmentation or TLS offload? The hardware would need to know about your TCP-in-UDP protocol to be able to handle that efficiently.

Most hardware is fairly generic for tunneled protocols, and TX descriptors can take fields like "inner L4 header offset/len" and "outer L4 header offset/len".
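To make that concrete, here is a rough sketch of what filling in such offsets could look like for a TCP segment carried inside UDP. Everything here is made up for illustration: `struct tx_offload_desc`, its field names, and `describe_tcp_in_udp` are hypothetical and don't match any particular NIC's descriptor layout. It also assumes an untagged Ethernet frame, with `tunnel_hdr_len` covering whatever sits between the outer UDP header and the inner IP header.

```c
#include <stdint.h>

/* Hypothetical offload fields of a TX descriptor. Real NICs pack
 * this information differently; the names are illustrative only. */
struct tx_offload_desc {
    uint16_t outer_l4_off; /* byte offset of the outer UDP header  */
    uint16_t outer_l4_len; /* length of the outer UDP header       */
    uint16_t inner_l3_off; /* byte offset of the inner IP header   */
    uint16_t inner_l4_off; /* byte offset of the inner TCP header  */
    uint16_t inner_l4_len; /* length of the inner TCP header       */
    uint16_t mss;          /* payload bytes per segment            */
};

enum {
    ETH_HDR_LEN = 14, /* assumption: untagged Ethernet, no VLAN */
    UDP_HDR_LEN = 8
};

/* Compute header offsets for the layout:
 * Ethernet | outer IP | UDP | tunnel header | inner IP | TCP | payload */
static struct tx_offload_desc
describe_tcp_in_udp(uint16_t outer_ip_len, uint16_t tunnel_hdr_len,
                    uint16_t inner_ip_len, uint16_t inner_tcp_len,
                    uint16_t mss)
{
    struct tx_offload_desc d;

    d.outer_l4_off = (uint16_t)(ETH_HDR_LEN + outer_ip_len);
    d.outer_l4_len = UDP_HDR_LEN;
    d.inner_l3_off = (uint16_t)(d.outer_l4_off + UDP_HDR_LEN + tunnel_hdr_len);
    d.inner_l4_off = (uint16_t)(d.inner_l3_off + inner_ip_len);
    d.inner_l4_len = inner_tcp_len;
    d.mss = mss;
    return d;
}
```

The point is that the hardware only needs a handful of offsets and lengths, not any knowledge of what the tunnel header means, so the same descriptor fields can serve any encapsulation that fits this shape.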

Generic support for tunneled TCP is far more doable than support for a new and volatile protocol.