Most QUIC stacks are built on in-kernel UDP. You get significant performance benefits if your traffic doesn't have to cross between kernel and user space, with the copies and context switches that involves.
You can work that angle by moving networking into user space: set up the NIC queues so that user space can access them directly, without needing to context switch into the kernel.
Or you can work the angle by moving networking into kernel space: things like sendfile, which lets a TCP application instruct the kernel to send a file to the peer without copying the content into user space, then back into kernel space, and finally into device memory. If you have in-kernel TLS with sendfile, you can continue to skip the copy to user space. If you have NIC-based TLS, the kernel doesn't need to read the data from the disk. If you have NIC-based TLS and the disk can DMA to the NIC buffers, the data doesn't even need to hit main memory. And so on.
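To make the sendfile path concrete, here is a minimal sketch in Python, assuming Linux and a plain (non-TLS) TCP connection over loopback. The helper names `send_file_via_kernel` and `demo` are made up for illustration; `os.sendfile` is the standard-library wrapper around sendfile(2). The file bytes go from the page cache to the socket without ever being read into a user-space buffer.

```python
import os
import socket
import tempfile


def send_file_via_kernel(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected TCP socket using sendfile(2).

    The process never read()s the file contents; it only tells the
    kernel which file descriptor, offset, and byte count to transmit.
    """
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:  # unexpected end of file
                break
            sent += n
    return sent


def demo(payload: bytes) -> bytes:
    """Round-trip a temporary file over a loopback TCP connection."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(payload)
        tmp.flush()
        with socket.create_server(("127.0.0.1", 0)) as srv:
            client = socket.create_connection(srv.getsockname())
            conn, _ = srv.accept()
            with conn:
                send_file_via_kernel(conn, tmp.name)
            # conn is closed now, so the client will see EOF after the data.
            chunks = []
            with client:
                while True:
                    chunk = client.recv(65536)
                    if not chunk:
                        break
                    chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(len(demo(b"hello sendfile\n" * 1000)), "bytes received")
```

The in-kernel TLS and NIC offload variants mentioned above are not shown here; this only covers the plain TCP case.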
But most QUIC stacks don't benefit from either side of that. They read and write packets via syscalls, and they do all the packetization in user space. There's no chance to use sendfile and skip a context switch and a copy. Batching I/O via io_uring or similar helps with context switches, but probably doesn't prevent copies.
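For contrast, a rough sketch of what the user-space path looks like, stripped of everything QUIC-specific (no encryption, no headers, no loss recovery). The 1200-byte payload size and the names `send_in_datagrams` and `demo` are assumptions for illustration. The point is only that every datagram costs one syscall and one copy from the process's buffer into the kernel.

```python
import socket

MAX_PAYLOAD = 1200  # assumed per-datagram payload size, for illustration only


def send_in_datagrams(sock, data, addr, max_payload=MAX_PAYLOAD) -> int:
    """Packetize in user space: one sendto() syscall per datagram.

    The memoryview slices avoid extra copies inside the process, but each
    sendto() still copies that slice from user memory into the kernel.
    """
    view = memoryview(data)
    count = 0
    for off in range(0, len(data), max_payload):
        sock.sendto(view[off:off + max_payload], addr)
        count += 1
    return count


def demo(data: bytes):
    """Send data over loopback UDP and report how it was split."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2.0)
        syscalls = send_in_datagrams(tx, data, rx.getsockname())
        sizes = [len(rx.recv(65535)) for _ in range(syscalls)]
    return syscalls, sizes


if __name__ == "__main__":
    print(demo(bytes(5000)))
```

A real stack would batch these with sendmmsg or io_uring, which cuts the number of kernel entries but still copies each payload.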
Yeah, there are also a lot of offloads that can be done in the kernel with UDP (e.g. UDP segmentation offload, generic receive offload, checksum offload), and offloading QUIC entirely would be a natural extension of that.
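As a sketch of the first of those offloads: on Linux, UDP segmentation offload is exposed as the `UDP_SEGMENT` socket option, which lets you hand the kernel one large buffer and have it split into fixed-size datagrams. The value 103 comes from `<linux/udp.h>` and is defined by hand here in case Python's `socket` module doesn't export it; `send_with_gso` and `demo` are made-up names. The code falls back to one `sendto()` per datagram when the option isn't available, so treat it as an illustration rather than a tuned implementation.

```python
import socket

# Linux socket option from <linux/udp.h>; hard-coded as a fallback.
UDP_SEGMENT = getattr(socket, "UDP_SEGMENT", 103)


def send_with_gso(sock, data, addr, seg_size=1200) -> bool:
    """Try to send `data` in one syscall, letting the kernel segment it.

    Returns True if the segmentation-offload path was used, False if we
    fell back to one sendto() per datagram.
    """
    try:
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, seg_size)
        gso = True
    except OSError:
        gso = False  # option not supported on this platform or kernel

    if gso:
        try:
            sock.sendto(data, addr)  # one syscall for the whole buffer
            return True
        except OSError:
            pass  # fall through to the per-datagram loop below
        finally:
            # Switch segmentation back off for later sends on this socket.
            sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, 0)

    for off in range(0, len(data), seg_size):
        sock.sendto(data[off:off + seg_size], addr)
    return False


def demo(data: bytes, seg_size: int = 1200):
    """Send over loopback UDP and collect everything that arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2.0)
        used_gso = send_with_gso(tx, data, rx.getsockname(), seg_size)
        received = bytearray()
        while len(received) < len(data):
            received += rx.recv(65535)
    return used_gso, bytes(received)


if __name__ == "__main__":
    used, got = demo(bytes(range(256)) * 18)
    print("GSO used:", used, "bytes received:", len(got))
```

Either way the receiver ends up with the same bytes; what changes is how many times the sender enters the kernel.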
It just offers people the choice of the right solution at the right moment.