"There have been individual clients downloading the exact same 20-GB file 100s of times per day, for several days in a row. (Just the other day, one user has managed to download almost 10,000 copies of the italy-latest.osm.pbf file in 24 hours!) Others download every single file we have on the server, every day."
This sounds like a problem rate limiting would easily solve. What am I missing? The page claims almost 10,000 copies of the same file were downloaded by the same user.
The server operator is able to count the number of downloads in a 24h period for an individual user but cannot or will not set a rate limit.
Why not?
Will the users mentioned above (a) read the operator's message on this web page and then (b) change their behaviour?
I would bet against (a) and therefore (b) as well.
Geofabrik guy here. You are right - rate limiting is the way to go. It is not trivial though. We use an array of Squid proxies to serve stuff and Squid's built-in rate limiting only does IPv4. While most over-use comes from IPv4 clients it somehow feels stupid to do rate limiting on IPv4 and leave IPv6 wide open. What's more, such rate-limiting would always just be per-server which, again, somehow feels wrong when what one would want to have is limiting the sum of traffic for one client across all proxies... then again, maybe we'll go for the stupid IPv4-per-server-limit only since we're not up against some clever form of attack here but just against carelessness.
HAProxy stick tables work with either IPv4 or IPv6.
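To make the idea concrete, here is a minimal sketch of a per-client, per-file download limit that treats IPv4 and IPv6 clients alike. It is not how Geofabrik, Squid, or HAProxy implement anything; the names `client_key` and `DownloadLimiter`, the limit of 5 downloads, the 24-hour window, and the choice to group IPv6 clients by /64 prefix are all assumptions made for illustration. It also keeps its counters in one process, so it shares the "per-server only" limitation mentioned above; a limit across all proxies would need a shared store.

```python
import ipaddress
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60  # assumed window: 24 hours
MAX_DOWNLOADS = 5              # assumed limit per client and file


def client_key(addr: str) -> str:
    """Map a client address to the key its downloads are counted under."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        # Count IPv4-mapped IPv6 addresses under the plain IPv4 address.
        if ip.ipv4_mapped is not None:
            return str(ip.ipv4_mapped)
        # Assumption: one IPv6 client usually controls a whole /64,
        # so count the prefix rather than the single address.
        return str(ipaddress.ip_network(f"{ip}/64", strict=False))
    return str(ip)


class DownloadLimiter:
    """Sliding-window counter of downloads per (client, file)."""

    def __init__(self, max_downloads=MAX_DOWNLOADS, window=WINDOW_SECONDS):
        self.max_downloads = max_downloads
        self.window = window
        self._hits = defaultdict(deque)  # (client key, path) -> timestamps

    def allow(self, addr: str, path: str, now=None) -> bool:
        now = time.time() if now is None else now
        hits = self._hits[(client_key(addr), path)]
        # Forget downloads that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_downloads:
            return False  # over the limit: the caller would answer with HTTP 429
        hits.append(now)
        return True
```

With these assumed defaults, `DownloadLimiter().allow("2001:db8::1", "/italy-latest.osm.pbf")` returns `True` for the first five calls within 24 hours and `False` after that, and a request from `2001:db8::2` counts against the same budget because it falls in the same /64.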