one angle that hasn't come up here yet: ECH basically kills TLS fingerprinting as a bot detection signal

right now tools like Cloudflare Bot Management rely heavily on JA3/JA4 hashes - they fingerprint the ClientHello to identify scrapers vs real browsers. if the ClientHello is encrypted, that whole detection layer collapses. you can still do behavioral analysis and JS challenges, but the pre-HTTP layer that currently catches a huge chunk of naive bots - gone
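
for anyone who hasn't looked at what a JA3 hash actually is, here's a rough sketch, assuming the ClientHello has already been parsed into integer lists. the function names and the sample values are made up for illustration, and this says nothing about how Cloudflare computes or uses it internally. JA3 as publicly documented is just an MD5 over five comma-separated fields, with GREASE values dropped:

```python
import hashlib

# GREASE values (RFC 8701): 0x0a0a, 0x1a1a, ..., 0xfafa. JA3 ignores them
# because browsers insert them at random to keep servers tolerant.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from already-parsed ClientHello fields.

    Each argument is the decimal value(s) as they appear on the wire,
    in the order the client sent them (order is part of the fingerprint).
    """
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """JA3 fingerprint: MD5 hex digest of the JA3 string."""
    s = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(s.encode("ascii")).hexdigest()


# Hypothetical, heavily shortened ClientHello for illustration only.
example = ja3_string(771, [0x0A0A, 4865, 4866], [0x1A1A, 0, 23], [0x2A2A, 29, 23], [0])
print(example)  # 771,4865-4866,0-23,29-23,0
```

everything that goes into the hash comes from the ClientHello, so whoever can read the ClientHello can compute it, and whoever can't, can't.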

curious how Cloudflare handles this internally given they're one of the biggest ECH adopters but also one of the biggest bot detection vendors. seems like they're eating their own lunch on this one, or they've already shifted their detection stack to not rely on it as much

Cloudflare can and must decrypt the ClientHello for the sites it serves in order to actually serve the traffic. Using ECH with CF means you use their ECH domain and their keys.

If you control the domain you're fingerprinting clients on, you can decrypt the inner ClientHello and fingerprint on that.

If you're not in control of the domain you're fingerprinting, then ECH is working as intended.

I don't expect naive bots to implement ECH any time soon, though. If a bot can't be bothered to download curl-impersonate, they won't pass any ECH flags either.

the naive bot point is true but the threat model that actually matters is the other end - the sophisticated bots that already do implement TLS spoofing. those are the ones using got-scraping, curl-impersonate, or custom TLS stacks specifically to pass JA3/JA4 checks. they're already past the "can't be bothered" threshold.

for that tier, ECH flips the dynamic a bit. right now detection can use JA3/JA4 as a positive signal - "this fingerprint matches Chrome 120, looks clean". with ECH, if the bot is running behind a CDN that terminates ECH (like Cloudflare), the server sees a decrypted ClientHello that looks like... a real Chrome on Cloudflare's infrastructure. the fingerprint is clean by construction.

so paradoxically ECH might make things harder for the sophisticated bot detection case while doing nothing about the naive case, which is sort of backwards from what you'd expect.

It doesn't prevent fingerprinting, stop spreading misinformation. It only prevents your ISP from knowing what website you're connecting to.

fair point, I should have been more precise. the server (Cloudflare in this case) still decrypts the inner ClientHello and can fingerprint it - jannesan and jeroenhd are right about that.

the part that changes is passive fingerprinting from third parties - network middleboxes, ISPs, DPI systems that have historically been able to read ClientHello parameters in transit and build behavioral profiles. that layer goes away. for bot detection specifically that matters less since detection happens at the server, so your correction stands for that use case.

the Cloudflare paradox I was gesturing at is maybe better framed as: for sites NOT on Cloudflare, ECH makes it harder for Cloudflare (as a network observer) to do pre-connection fingerprinting. but for their own CDN customers, they decrypt it anyway so nothing changes for them. the conflict is more theoretical than practical for their current product.

> the part that changes is passive fingerprinting from third parties - network middleboxes, ISPs, DPI systems

Right. Things that should never have been allowed to exist to begin with. Working as designed.

> the part that changes is passive fingerprinting from third parties

That's exactly what I said:

> It only prevents your ISP from knowing what website you're connecting to.

Why would Clownflare ever see traffic to sites not on Clownflare?

They do routing. Even if you're connecting to a non-Cloudflare server, the traffic may still be routed through their network.

Why would they want to peek at the traffic? Most likely for statistics (most frequently visited websites, etc.).

Can you give an example of a BGP route or traceroute to a site not on Clownflare that was routed through Clownflare?

It depends on the origin and the destination. Their Magic Transit service explicitly allows this, and I assume they have agreements with other ASes in case something goes wrong on either side (it often does). You'd have to ask them directly to know the specifics, but I don't think they would answer since that's proprietary information.

Since most ISPs also maintain their own DNS resolver, they could always do a reverse lookup on the IP address, AFAIK.

The whole idea behind ECH is that one IP hosts tons of sites (e.g. a CDN), so you have no idea which one it is.

Also, a reverse lookup has nothing to do with hosting your own DNS resolver.

What you're describing is SNI, not ECH. Those two serve very different purposes.

> Also, a reverse lookup has nothing to do with hosting your own DNS resolver.

It has everything to do with that. Had you used two brain cells, you would've known that they can memorize the IP address and the domain name, and if you connect to that IP in a short period of time, most likely you visited that domain name.
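
A minimal sketch of that correlation, assuming the ISP's resolver logs each answer it hands out. The class name, the 60-second window, and the example domains and addresses are all hypothetical, and this is not a claim about what any ISP actually runs:

```python
from collections import defaultdict


class DnsCorrelator:
    """Toy model: remember which domains a client resolved to which IP,
    then guess the domain when the client connects to that IP shortly after."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        # (client, ip) -> list of (timestamp, domain) seen in DNS answers
        self._seen = defaultdict(list)

    def observe_dns_answer(self, client, domain, ip, timestamp):
        self._seen[(client, ip)].append((timestamp, domain))

    def guess_domains(self, client, ip, timestamp):
        """Return every domain this client resolved to `ip` within the window."""
        return {
            domain
            for seen_at, domain in self._seen.get((client, ip), [])
            if 0 <= timestamp - seen_at <= self.window
        }


isp = DnsCorrelator(window_seconds=60.0)
isp.observe_dns_answer("10.0.0.5", "blog.example", "203.0.113.7", timestamp=0.0)
isp.observe_dns_answer("10.0.0.5", "shop.example", "203.0.113.7", timestamp=5.0)

# Connection to the shared IP soon after: both domains are candidates.
print(isp.guess_domains("10.0.0.5", "203.0.113.7", timestamp=10.0))
# Much later, nothing falls inside the window any more.
print(isp.guess_domains("10.0.0.5", "203.0.113.7", timestamp=500.0))
```

Note that the guess is a set, not a single name: when many sites share one IP, the resolver log narrows it down only to whatever the client looked up recently, and it sees nothing at all if the lookups never go through that resolver.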

SNI is unencrypted, so your ISP can see it. ECH encrypts it.

How does this relate to my comment?

True. ECH is useless if you're using plain DNS. DNS over TLS or HTTPS is the way to go.

What OP wrote seems correct:

> ECH basically kills TLS fingerprinting as a bot detection signal

They are not talking about fingerprinting in general. Please elaborate how else TLS fingerprinting can be done.

I am talking about TLS fingerprinting, not JS fingerprinting.

> Please elaborate how else TLS fingerprinting can be done.

By doing everything as it is right now?

How would you (an arbitrary web server) fingerprint a TLS connection if the ClientHello is encrypted?

The website owner (or Cloudflare in this case) has the keys to decrypt the ClientHello. It has to, because the inner ClientHello carries the real server name needed to route the request.

You're right, sorry! I got confused myself.

By decrypting it? I don't think you know how TLS, or E2E encryption in general, works. The ISP doesn't perform the fingerprinting, the server does.

Of course! My bad, thanks for engaging.
