> QUIC is meant to be fast, but the benchmark results included with the patch series do not show the proposed in-kernel implementation living up to that. A comparison of in-kernel QUIC with in-kernel TLS shows the latter achieving nearly three times the throughput in some tests. A comparison between QUIC with encryption disabled and plain TCP is even worse, with TCP winning by more than a factor of four in some cases.

Jesus, that's bad. Does anyone know if userspace QUIC implementations are also this slow?

I think the ‘fast’ claims are just different. QUIC is meant to make things fast by:

- having a lower latency handshake

- avoiding some badly behaved ‘middleware’ boxes between users and servers

- avoiding resetting connections when user IP addresses change

- avoiding head of line blocking / the increased cost of many connections ramping up

- avoiding poor congestion control algorithms

- probably other things too

And those are all things about working better with the kind of network situations you tend to see between users (often on mobile devices) and servers. I don’t think QUIC was meant to be fast by reducing OS overhead on sending data, and one should generally expect it to be slower for a long time until operating systems become better optimised for this flow and hardware supports offloading more of the work. If you are Google then presumably you are willing to invest in specialised network cards/drivers/software for that.

Yeah, I totally get that it optimizes for different things. But the trade-offs seem way too severe. Does saving one round trip on the handshake mean anything at all if you're only getting one fourth of the throughput?

Are you getting one fourth of the throughput? Aren’t you going to be limited by:

- bandwidth of the network

- how fast the nic on the server is

- how fast the nic on your device is

- whether the server's response fits in the amount of data that can be sent given the client's initial receive window, or whether several round trips are needed to scale the window up before the server can use the available bandwidth (rough sketch below)
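
As a back-of-envelope sketch of that last point (all numbers illustrative: a ~14.6 KB / 10-segment initial window, idealised slow-start doubling each RTT, no loss), here is roughly how many round trips it takes before a response of a given size has been fully sent:

```c
/* Illustrative numbers only: 10-segment initial window, idealised
 * slow-start doubling per RTT, no losses. */
#include <stdio.h>

int main(void)
{
    const double mss = 1460.0;             /* bytes per segment         */
    double window    = 10 * mss;           /* ~14.6 KB initial window   */
    double remaining = 1 * 1024 * 1024;    /* 1 MB response             */
    int rtts         = 0;

    while (remaining > 0) {
        remaining -= window;   /* data the sender can push this round trip */
        window    *= 2;        /* slow start roughly doubles per RTT       */
        rtts++;
    }
    printf("~%d RTTs to deliver the response\n", rtts);
    return 0;
}
```

For a 1 MB response that comes out to around 7 RTTs, which is why a single round trip saved on the handshake matters most for short transfers.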

It depends on the use case. If your server is able to handle 45k connections but 42k of them are stalled because of mobile users with too much packet loss, QUIC could look pretty attractive. QUIC is a solution to some of the problematic aspects of TCP that couldn't be fixed without breaking things.

The primary advantage of QUIC for things like congestion control is that companies like Google are free to innovate both sides of the protocol stack (server in prod, client in chrome) simultaneously. I believe that QUIC uses BBR for congestion control, and the major advantage that QUIC has is being able to get a bit more useful info from the client with respect to packet loss.

This could be achieved by encapsulating TCP in UDP and running a custom TCP stack in userspace on the client. That would allow protocol innovation without throwing away 3 decades of optimizations in TCP that make it 4x as efficient on the server side.

Is that true? Aren’t lots of the tcp optimisations about offloading work to the hardware, eg segmentation or tls offload? The hardware would need to know about your tcp-in-udp protocol to be able to handle that efficiently.

Most hardware is fairly generic for tunneled protocols, and tx descriptors can take things like "inner l4 header offset/len" and "outer l4 header offset/len"

Generic support for tunneled TCP is far more doable than support for a new and volatile protocol.
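
For illustration only, here's a hypothetical descriptor layout (not any real NIC's API) showing the kind of "inner/outer header offset" fields the parent comment is talking about; the point is that generic tunnel offload only needs to be told where the outer and inner headers sit, so a TCP-in-UDP encapsulation could reuse existing checksum/TSO engines:

```c
/* Hypothetical tx descriptor - not any real NIC's API - sketching the
 * generic tunnel-offload fields described above. */
#include <stdint.h>

struct tx_desc_tunnel_offload {
    uint64_t dma_addr;          /* frame buffer address                      */
    uint16_t outer_l3_offset;   /* outer IP header offset into the frame     */
    uint16_t outer_l4_offset;   /* outer UDP header offset                   */
    uint16_t inner_l3_offset;   /* encapsulated IP header offset             */
    uint16_t inner_l4_offset;   /* encapsulated TCP header offset            */
    uint16_t inner_l4_len;      /* encapsulated TCP header length            */
    uint16_t mss;               /* segment size if inner TSO is requested    */
    uint32_t flags;             /* e.g. "inner is TCP", "compute inner csum" */
};
```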

Maybe it’s a fourth as fast in ideal situations with a fast LAN connection. Who knows what they meant by this.

It could still be faster in real world situations where the client is a mobile device with a high latency, lossy connection.

There are claims of 2x-3x higher operating costs on the server side to deliver a better UX for phone users.

> - avoiding some badly behaved ‘middleware’ boxes between users and servers

Surely badly behaving middleboxes won't just ignore UDP traffic? If anything, they'd get confused about udp/443 and act up, forcing clients to fall back to normal TCP.

Your average middlebox will just NAT UDP (unless it's outright blocked by security policy) and move on. It's TCP where many middleboxes think they can "help" with congestion signaling, latch more deeply onto the session information, or worse. Unencrypted protocols can see further interference beyond this, under either TCP or UDP.

QUIC is basically about taking all of the information middleboxes like to fuck with in TCP, putting it under the encryption layer, and packaging it back up in a UDP packet precisely so it's either just dropped or forwarded. In practice this (i.e. QUIC either being just dropped or left alone) has actually worked quite well.

Yes. msquic is one of the best performing implementations and only achieves ~7 Gbps [1]. The benchmarks for the Linux kernel implementation only get ~3 Gbps to ~5 Gbps with encryption disabled.

To be fair, the Linux kernel TCP implementation only gets ~4.5 Gbps at normal packets sizes and still only achieves ~24 Gbps with large segmentation offload [2]. Both of which are ridiculously slow. It is straightforward to achieve ~100 Gbps/core at normal packet sizes without segmentation offload with the same features as QUIC with a properly designed protocol and implementation.

[1] https://microsoft.github.io/msquic/

[2] https://lwn.net/ml/all/cover.1751743914.git.lucien.xin@gmail...

Yes, they are. Worse, I've seen them shrink down to nothing in the face of congestion with TCP traffic. If QUIC is indeed the future protocol, it's a good thing to move it into the kernel IMO. It's just madness to provide these massive userspace impls everywhere, on a packet-switched protocol no less, and expect them to beat good old TCP. Wouldn't surprise me if we need optimizations all the way down to the NIC layer, and maybe even middleboxes. Oh, and I haven't even mentioned the CPU cost of UDP.

OTOH, TCP is like a quiet guy at the gym who always wears baggy clothes but does 4 plates on the bench when nobody is looking. Don't underestimate. I wasted months to learn that lesson.

Why is QUIC being pushed, then?

From what I understand, the "killer app" initially was spotty mobile networks. TCP is interface- (and IP-) specific, so if you switch from WiFi to LTE the conn breaks (or worse, degrades/times out slowly). QUIC has a logical conn ID that continues to work even when a peer changes paths. Thus, your YouTube ads will not buffer.

Secondarily, you have the reduced RTT, multiple streams (prevents HOL blocking), datagrams (realtime video on the same conn), and the ability to scale buffers (in userspace) to avoid BDP limits imposed by the kernel. However, I think in practice those haven't gotten as much visibility and traction, so the original reason is still the main one from what I can tell.

MPTCP provides interface mobility. It's seen widespread deployment with the iPhone, so network support today is much better than one would assume. Unlike QUIC, the changes required by applications are minimal to none. And it's backward compatible; an application can request MPTCP, but if the other end doesn't support it, everything still works.
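
As a sketch of the "minimal to none" application changes: on Linux, requesting MPTCP is one constant in the socket() call, and falling back when the local kernel is too old is a couple of lines (the protocol-level fallback to plain TCP when the peer doesn't support MPTCP happens automatically). The helper name here is hypothetical:

```c
/* Sketch only: request MPTCP, fall back to plain TCP on kernels without
 * support. open_stream_socket() is a hypothetical helper. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262   /* 262 on Linux; MPTCP support landed in 5.6 */
#endif

int open_stream_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
    if (fd < 0 && errno == EPROTONOSUPPORT)
        fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);  /* plain TCP */
    return fd;
}
```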

It has good properties compared to tcp-in-tcp (http/2), especially when connected to clients without access to modern congestion control on iffy networks. http/2 was perhaps adopted too broadly; the binary protocol is useful, header compression is useful (but sometimes dangerous), but multiplexing over a single TCP connection is bad unless you have very low loss ... it's not ideal for phones with inconsistent networking.

I know in the p2p space, peers have to send lots of small pieces of data. QUIC stops a single delayed packet from blocking every stream.

because it _does_ provide a number of benefits (potentially fewer initial round-trips, more dynamic routing control by using UDP instead of TCP, etc.), even though today it is a userspace software implementation competing with a hardware-accelerated option.

QUIC getting hardware acceleration should close this gap and keep all the benefits. But a kernel (software) implementation is basically necessary before it can be properly accelerated by future hardware (that's my current understanding).

To clarify, the userspace implementation is not a benefit; it's just that you can't have a brand new protocol dropped into a trillion dollars of existing hardware overnight. You have to do userspace first as a PoC.

It does save 2 round-trips during connection compared to TLS-over-TCP, if Wikipedia's diagram is accurate: https://en.wikipedia.org/wiki/QUIC#Characteristics That is a decent latency win on every single connection, and with 0-RTT you can go further, but 0-RTT is stateful and hard to deploy and I expect it will see very little use.

The problem it is trying to solve is not the overhead of the Linux kernel on a big server in a datacenter.

Google wants control.

QUIC performance requires careful use of batching. Using UDP sockets naively, i.e. sending one QUIC packet per syscall, will incur a lot of overhead - every time, the kernel has to figure out which interface to use, queue the packet up in a buffer, and all the rest. Using it more like TCP - batching up lots of data and enqueuing many packets per “call” - helps a ton. Similarly, the kernel WireGuard implementation can be slower than wireguard-go since it doesn’t batch traffic. At the speeds offered by modern hardware, we really need to use vectored I/O to be efficient.
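
A minimal sketch of that batching point, assuming an unconnected UDP socket and fixed-size packet buffers (the send_batch() helper and sizes are made up): sendmmsg(2) queues a whole batch of QUIC packets in one syscall instead of one sendto(2) per packet, and a real stack would likely add UDP GSO (the UDP_SEGMENT socket option) on top:

```c
/* Sketch only: batch many QUIC packets into one sendmmsg(2) call.
 * Buffer layout, sizes, and the send_batch() helper are hypothetical. */
#define _GNU_SOURCE
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

#define BATCH   64
#define PKT_LEN 1200            /* typical QUIC packet size */

int send_batch(int fd, struct sockaddr_in *dst,
               unsigned char pkts[BATCH][PKT_LEN], unsigned int npkts)
{
    struct mmsghdr msgs[BATCH];
    struct iovec   iovs[BATCH];

    memset(msgs, 0, sizeof(msgs));
    for (unsigned int i = 0; i < npkts; i++) {
        iovs[i].iov_base            = pkts[i];
        iovs[i].iov_len             = PKT_LEN;
        msgs[i].msg_hdr.msg_iov     = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen  = 1;
        msgs[i].msg_hdr.msg_name    = dst;
        msgs[i].msg_hdr.msg_namelen = sizeof(*dst);
    }

    /* One syscall for the whole batch: route lookup, socket locking and
     * the rest of the per-send overhead are amortized over npkts packets. */
    return sendmmsg(fd, msgs, npkts, 0);
}
```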

I would expect a protocol such as TCP to perform much better than QUIC in benchmarks. Now do a realistic benchmark over a roaming LTE connection and come back with the results.

Without seeing actual benchmark code, it's hard to tell if you should even care about that specific result.

If your goal is to pipe lots of bytes from A to B over an internal network or the public internet, there probably aren't many things, if any, that can outperform TCP. Decades were spent optimizing TCP for that. If HOL blocking isn't an issue for you, then you can keep using HTTP over TCP.

IMO being Google's proprietary crap is enough reason to stay away. It not actually being any better is an even more compelling reason.

It's not proprietary. It is an IETF standard and several of the authors are not from Google.

The same can be said of the web "standards", and yet look what happened there...

It’s an interesting testament to how well designed TCP is.

IMO, it's more a testament to how fast hardware designers can make things with 30 years to tune.