I definitely noticed the performance boost on my Pixel 8. For some reason it really doesn't like wireguard-go; it struggled to pull even 100 Mbps, maybe because something is unoptimized on Google's custom hardware. With the new GotaTun version I can pull 500+ Mbps. Unfortunately it also seems to have introduced a bug that randomly prevents the phone from entering deep sleep, so occasionally my battery starts draining at 10x normal speed if I have it enabled, until I reboot.

I'm surprised by this comment. I have wireguard on 24/7 on my shitty Samsung A5 and it lasts forever. By comparison the Pixel 8 is a beast. Sounds like an Android bug more than wireguard.

Pixel 6 here. Vanilla wireguard app. It sucks the life out of my phone and nearly halves the already half-life battery (thanks Google for your crappy OEM producers!)

Thank Samsung for their shitty modems in the Pixels.

However, there’s going to be a large discrepancy in battery usage across all devices based on whether the VPN is on wifi or cellular, and, when on cellular, on how close to the tower they are. I live near the cell edge and VPNs roast my batts on cellular no matter the make; in the city it’s almost not noticeable to have a VPN on. Better to use wifi when far from towers; cellular is more efficient if the signal is strong.

They must have fixed it in recent versions. My Pixel 9 Pro battery seems the same with Proton VPN (WireGuard) on or off.

What app are you using?

It's just called WireGuard, by the "WireGuard Development Team" off google play.

Pretty sure that's the C implementation, not the Go one.

AFAIK the C implementation is a kernel module that's not shipped in stock Android releases. The WireGuard Android app uses that module when available, but otherwise uses wireguard-go.

Good knowledge here, was unaware of this feature of the app. Would there be any case of the app defaulting to the wireguard kernel module if it's not included by any OEM Android release? I would assume that means most users are actually running wireguard-go.

I hope so.

Android kernels 4.19 and higher should all include WireGuard support unless the OEM specifically disables it: [0]. The Pixel 8 ships with the Android 14 6.1 kernel, so it most definitely should have WireGuard kernel support. You can check this in the WireGuard app BTW: if you go to settings it will show the backend that's in use.

[0] https://android-review.googlesource.com/c/kernel/common/+/14...

Kernel support should have no bearing as the apps are purely userspace apps. You can use the kernel mode if you root the phone, but that's not a typical scenario.

Well, the issue isn't kernel vs user space, but you are correct that you still need a custom ROM and/or root unfortunately. I had assumed Android had also allowed netlink sockets for WireGuard but alas they did not. So the app can't communicate with the kernel module, bummer.

Same behavior on raspberry pi 5. Might be just lack of arm optimizations.

It's very likely that VPNs like this are not CPU-bound, even on somewhat wimpy CPUs. I'd wager even some microcontrollers could sling 500 megabits/sec around without trouble.

You’re in for a surprise then once you actually go look at the performance.

A Raspberry Pi 4 can manage something like 70Mbps of raw AES en/decryption flow: https://github.com/lelegard/aesbench/blob/main/RESULTS.txt

That CPU is pretty much a toy compared to (say) a brand-new M5 or EPYC chip, but it similarly eclipses almost any MCU you can buy.

Even with fast AES acceleration on the CPU/MCU (which I think some Cortex MCUs have), you’re really going to struggle to get much over 100Mbits of encrypted traffic handling, and that’s before the I/O handling interrupts take over the whole chip to shuttle packets on and off the wire.

Modern crypto is cheap for what you get, but it’s still a lot of extra math in the mix when you’re trying to pump bytes in and out of a constrained device.

You're looking at the wrong thing: WireGuard doesn't use AES, it uses ChaCha20. AES is really, really painful to implement securely in software only, and the result performs poorly. But ChaCha only uses addition, rotation, and XOR on 32-bit numbers, and that makes it pretty performant even on fairly computationally limited devices.
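
To illustrate the add-rotate-XOR point, here is a minimal Python sketch of the ChaCha quarter round, the building block that ChaCha20 applies repeatedly to its 16-word state. The function names are illustrative only, and the sample input is the quarter-round test vector from RFC 8439, section 2.1.1. This is not a full cipher and not how any particular WireGuard implementation is written.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(a, b, c, d):
    """One ChaCha quarter round: only 32-bit add, XOR, and rotate."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


# Input words from the RFC 8439 quarter-round test vector.
result = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(w) for w in result])
```

There are no table lookups or multiplications anywhere in the round, which is why it maps well onto small cores without crypto extensions.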

For reference, I have an implementation of ChaCha20 running on the RP2350 at 100MBit/s on a single core at 150MHz (910/64 = ~14.22 cycles per byte). That's a lot for a cheap microcontroller costing around 1.5 bucks total. And that's not even taking into account using the other core the RP2350 has, or overclocking (it also runs fine at 300MHz, at double the speed).

You’re totally right; I got myself spun around thinking AES instead of ChaCha because the product I work on (ZeroTier) started with the latter initially and moved to AES later. I honestly just plain forgot that WireGuard hadn’t followed the same path.

An embarrassing slip, TBH. I’m gonna blame pre-holiday brain fog.

Yeah no, this is very much not true, even more so for a Go-based implementation and energy consumption optimized ARM devices.

MTU strikes again. 1320.

[deleted]

Why 1320 and not larger?

For most any 5G network you should be safe to 1420 - 80 = 1340 bytes if using IPv6 transport or 1420 - 60 = 1360 bytes if using IPv4 transport.

For testing I recommend starting from 1280 as a "does this even work" baseline and then tweaking from there. I.e. 1280 either as the "outside" MTU if you only care about IPv4 or as the "inside" MTU if you want IPv6 to work through the tunnel. This leverages that IPv6 demands a 1280 byte MTU to work.
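
The subtractions above are just the per-packet encapsulation overhead. A rough Python sketch, assuming a usable path MTU of 1420 bytes as in the figures above and no IP options or extension headers; `tunnel_mtu` is a made-up helper name, not part of any WireGuard tool:

```python
# Bytes WireGuard adds around each tunneled data packet.
WG_HEADER = 16    # message type + reserved (4), receiver index (4), counter (8)
WG_AUTH_TAG = 16  # Poly1305 authentication tag
UDP_HEADER = 8
IP_HEADER = {"ipv4": 20, "ipv6": 40}  # assumes no options / extension headers


def tunnel_mtu(path_mtu, transport):
    """Largest inner packet that fits in one outer packet of size path_mtu."""
    overhead = IP_HEADER[transport] + UDP_HEADER + WG_HEADER + WG_AUTH_TAG
    return path_mtu - overhead


for transport in ("ipv4", "ipv6"):
    print(transport, tunnel_mtu(1420, transport))
```

The same arithmetic with a 1500-byte path and IPv6 transport gives the customary 1420 default.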

Hah! I just ran into this recently and can confirm. The coax to my DOCSIS ISP was damaged during a storm, which was causing upstream channels to barely work at all. (Amusingly, downstream had no trouble.) While waiting for the cable person to come around later in the week, I hooked my home gateway device up to an old phone instead of the modem. I figured there would be consequences, but surprisingly, everything went pretty smoothly... But my Wireguard-encapsulated connections all hung during the TLS handshake! What gives?

The answer is MTU. The MTU on my network devices was set to 1500 everywhere, and on my Wireguard devices to 1420, as is customary. However, I found that 1340 (1420 - 80) was the maximum I could use safely.

Wait, though... Why in the heck did that only impact Wireguard? My guess is that TCP connections were discovering the correct MSS value automatically. Realistically that does make sense, but something bothers me:

1. How come my Wireguard packets seemed to get lost entirely? Shouldn't they get fragmented on one end and re-assembled on the other? UDP packets are IP packets, surely they should fragment just fine?

2. Even if they don't, if the Linux TCP stack is determining the appropriate MSS for a given connection then why doesn't that seem to work here? Shouldn't the underlying TCP connection be able to discover the safe MSS relatively easily?

I spelunked through Linux code for a while looking for answers but came up empty. Wonder if anyone here knows.

My best guess is that:

1. A stateless firewall/NAT somewhere didn't like the fragmented UDP packets because it couldn't determine the source/dest ports and just dropped them entirely

2. Maybe MSS discovery relies on ICMP packets that were not able to make it through? (edit: Yeah, on second thought, this makes sense: if the Wireguard UDP packets are not making it to their destination, then the underlying encapsulated packets won't make it out either, which means there won't be any ICMP response when the TCP stack sends a packet with Don't Fragment set.)

But I couldn't find anything to strongly support that.

Basically the only parts of the Internet which actually work reliably, around the globe, are the bits needed so that web pages basically kinda work. If you break literally everything else your service is crap, and some customers might notice, but many won't and also some won't have a choice so, sucks to be them. But if you break the Web, now everybody notices that you broke stuff and they're angry.

This is why DoH (DNS over HTTPS) is a thing. It obviously makes no actual sense to use the web protocol to move DNS packets, but this works and most things don't work for everybody so eh, this is what we have. Smashing the Path MTU discovery doesn't break the web.

Breaking literally everything so long as the web pages work even means you can't upgrade parts of the web unless you get creative. TLS 1.3, the modern security protocol that is used for most of your web pages today, would not work for most people if it admitted that it's TLS 1.3. If you send packets with TLS version 1.3 on them, people's "intelligent" "best in class security" protective garbage (in the industry we call these "middle boxes") thinks it is being attacked by some unknown and unimaginable dastardly foe and kills the data. So TLS 1.3 really, I am not making this up, always pretends it is a TLS 1.2 re-connection, and despite the fact that no such connection ever existed these same "best in class security" technologies just have no idea what's happening and wave it through. It's very very stupid that they do that, but it was needed to make the web work, which matters, whereas actual security eh, suckers already bought the device, who cares.

This situation is deeply sad but, one piece of good news is that while "This Iranian woman can't even talk confidentially to her own mother without using code words because the people in charge there intercept her communications" won't attract as much sympathy as you'd like from some bearded white guy who has never left Ohio, the fact that those people broke his network protocol to do that interception infuriates him, and he's well up for ensuring they can't do that to the next version.

Your ultimate conclusion is correct, to my understanding. I know wireguard sought to be ultra minimal but I do wish they had included DPLPMTUD as something which is required to be supported (but not mandated to be used e.g. if the user wants to hard set it as they would currently) because it's one of those cases where "do it yourself separately the UNIX way™" or "have the tunneled things do it if they need it" instead are both significantly more complex and fragile.

On that note, from the TCP layer it should just look like an ICMP blackhole, which makes me wonder if enabling `net.ipv4.tcp_mtu_probing` will magically make TCP connections under Wireguard work even with the MTU set wrong. I'd try it, but unfortunately with a similar configuration I am unable to get the fragmentation behavior I was getting before; which makes me wonder if it was my UniFi Security Gateway that actually didn't like the fragmented packets.
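
For anyone wanting to try the same experiment, a small Python sketch for checking the current setting on a Linux box. The `/proc` path is the standard location of that sysctl; the helper name and the wording of the descriptions are my own paraphrase of the kernel's ip-sysctl documentation, and whether mode 1 actually rescues TCP inside a mis-sized Wireguard tunnel is untested here.

```python
from pathlib import Path

# Paraphrased meanings of the sysctl values (see the kernel's ip-sysctl docs).
MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled",
}


def mtu_probing_mode(path="/proc/sys/net/ipv4/tcp_mtu_probing"):
    """Return (raw value, description) for the tcp_mtu_probing sysctl."""
    value = int(Path(path).read_text().strip())
    return value, MODES.get(value, "unknown")


try:
    print(mtu_probing_mode())
except (OSError, ValueError) as exc:
    # Not Linux, or the file is unreadable in this environment.
    print("could not read sysctl:", exc)
```

Changing it needs root, e.g. `sysctl -w net.ipv4.tcp_mtu_probing=1`.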

I think the problem is on the Android side. I noticed the same behavior with ZeroTier and even with MizuDroid, all totally unrelated.

Oh, this is the reason the Mullvad app on my Pixel 6a was suddenly able to connect in less than a second where before it would take 5-10 seconds, nice!

Do you have wireguard keepalives on?