Exactly. The problem is that some content, by its very nature, has to be dynamically generated.

To emphasize how absurd the current situation is: I host my own repositories with gotd(8) and gotwebd(8) to share within a small circle of people. There is no link on the Internet to the HTTP site served by gotwebd(8); the crawlers fished the subdomain out of the main TLS certificate, presumably via the Certificate Transparency logs. For the last six or so months I have been getting hit every few seconds by crawlers that ignore robots.txt (of course) and wander aimlessly around "high-value" pages like my OpenBSD repository forks, requesting blame, diff, etc.

I am still managing to serve things to real people just fine, even though at times two to three cores are running at full load on pointless requests. Maybe I will bother to address this eventually, since it is melting the ice caps and wearing out my disks, but for now I hope they choke on the data at some point and that it makes their models worse.
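For anyone in the same boat: one low-effort mitigation on OpenBSD is to rate-limit new connections in pf(4) and overload abusers into a blocked table. This is a hedged sketch, not my actual config; the thresholds and the table name <crawlers> are made up and would need tuning, and it only helps against crawlers that do not rotate source addresses aggressively:

```
# /etc/pf.conf sketch -- hypothetical values, adjust to taste
table <crawlers> persist

# Drop anything already flagged as abusive
block in quick on egress from <crawlers>

# Allow web traffic, but overload sources that open more than
# 100 connections total or more than 15 new connections per 5 s
pass in on egress proto tcp to port { 80 443 } keep state \
    (max-src-conn 100, max-src-conn-rate 15/5, \
     overload <crawlers> flush global)
```

Entries can be expired periodically with `pfctl -t crawlers -T expire 86400` from cron so a shared NAT address is not blocked forever.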