There's actually not much evidence of this, since the attack traffic is anonymous.
People on HN who work at these AI companies have commented to say they do this, and the timing correlates with the rise in AI companies and funding.
I haven't tried to find it in my own logs, but others have said blocking an identifiable AI bot soon led to the same pattern of requests continuing through a botnet.
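For concreteness, "identifiable" here usually means the crawler announces itself in its user-agent string. A minimal sketch (assuming a combined-format access log at a hypothetical path, and the tokens some AI crawlers are documented to advertise, such as GPTBot, ClaudeBot, and CCBot) that tallies how much traffic self-identifies that way; the claim above is that once you block on these strings, similar request patterns reappear under generic browser user agents spread across many IPs:

    # Sketch: tally hits from self-identified AI crawlers in a combined-format
    # access log. The log path and the token list are assumptions, not a spec.
    import re
    from collections import Counter

    LOG_PATH = "access.log"                       # hypothetical path
    AI_TOKENS = ("GPTBot", "ClaudeBot", "CCBot")  # tokens these crawlers advertise

    counts = Counter()
    ua_re = re.compile(r'"([^"]*)"\s*$')          # last quoted field = user agent

    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = ua_re.search(line)
            if not match:
                continue
            ua = match.group(1)
            for token in AI_TOKENS:
                if token in ua:
                    counts[token] += 1

    for token, n in counts.most_common():
        print(f"{token}: {n} requests")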
Did HN people present evidence?
And a few decades ago, it would have been search engine scrapers instead.
And that problem was largely solved by robots.txt. AI scrapers are ignoring robots.txt and beating the hell out of sites. Small sites that have decades' worth of quality information are suffering the most. Many of the scrapers are taking extreme measures to avoid being blocked, like using large numbers of distinct IP addresses (perhaps using botnets).
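For reference, the mechanism being ignored is simple: a well-behaved crawler fetches /robots.txt and checks each URL against it before requesting it. A minimal sketch using Python's standard urllib.robotparser, with example.com and the GPTBot token as placeholder values:

    # Sketch: how a compliant crawler consults robots.txt before fetching.
    # The site and user-agent token here are placeholders.
    #
    # A site opting out of a given crawler serves rules like:
    #   User-agent: GPTBot
    #   Disallow: /
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()  # fetches and parses the file

    url = "https://example.com/archive/page-1"
    if rp.can_fetch("GPTBot", url):
        print("robots.txt allows fetching", url)
    else:
        print("robots.txt disallows fetching", url)

The whole scheme is voluntary: nothing enforces it beyond the crawler choosing to run a check like this, which is why scrapers that skip it face no technical obstacle.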