we are having occasional lows in page speed performance due to LARGE amounts of bot traffic. full disclosure - we've not really been able to resolve this fully/well. Let us know if you have a good idea for how to deal with it
we are having occasional lows in page speed performance due to LARGE amounts of bot traffic. full disclosure - we've not really been able to resolve this fully/well. Let us know if you have a good idea for how to deal with it
If it's purely bot traffic, then Anubis could help
You could have seen it on some websites already
https://anubis.techaro.lol/
Do you host a torrent?
I have about 50k of the books, I would have used a torrent of just the txt files if it was prominent.
I'm only a small-scale sysadmin but the way that I understand the internet is that you send abuse notifications to the IP address block owner and, if it doesn't get resolved, you block. The whois/rdap database reveals which IPs all belong to the same hosting provider or ISP, so you can summarize that all to one list of IP addrs + timestamps per some time period
The ISP actually knows which subscriber is on that line, can send them notices, block them, terminate them... loads of things that you simply cannot do because you have no relation to this person. And frankly I wouldn't want to need to have a personal relation with every website that I visit; my ISP can reach me if there is anything relevant to continued use of the internet. From personal experience, when I was a teenager, the ISP cutting our household off after an abuse report was an effective way of stopping what I was doing
It’s effective against teenagers maybe. Not so much against Amazon, Meta or wherever botnet/crawler is coming out of China these days from up-and-coming AI companies.
I mean you could block entire AS numbers that relate to amazon or big tech datacenters
wouldn't help, much of the traffic we've observed look closer to ddos patterns - IPs from all over the world, many different networks, each IP makes one request only, doesn't come back. highly distributed, no form of blocking would be effective except maybe captcha or proof of work.
CF cache?