Geofabrik guy here. You are right - rate limiting is the way to go. It is not trivial though. We use an array of Squid proxies to serve stuff, and Squid's built-in rate limiting only handles IPv4. While most of the over-use comes from IPv4 clients, it feels stupid to rate-limit IPv4 and leave IPv6 wide open. What's more, such rate limiting would always be per server, which again feels wrong when what you actually want is to limit the sum of one client's traffic across all proxies. Then again, maybe we'll go for the stupid per-server IPv4-only limit after all, since we're not up against some clever form of attack here, just carelessness.
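
To make the "limit both address families" idea concrete, here is a minimal sketch of a token-bucket limiter that treats IPv4 and IPv6 clients alike. It is not Squid configuration and not what Geofabrik runs. The names `client_key` and `TokenBucketLimiter` are hypothetical, and grouping IPv6 clients by /64 prefix is an assumption about what counts as "one client". The bucket state lives in a plain in-process dict, so on its own this is still per-server; limiting the sum across all proxies would mean moving that state into a store shared by every proxy, which is not shown here.

```python
import ipaddress
import time


def client_key(addr: str) -> str:
    """Map a client address to a bucket key (hypothetical helper).

    IPv4 addresses are used as-is. IPv6 addresses are collapsed to their
    /64 prefix (an assumption), and IPv4-mapped IPv6 addresses are folded
    back onto the underlying IPv4 address.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        if ip.ipv4_mapped is not None:
            return str(ip.ipv4_mapped)
        return str(ipaddress.ip_network(f"{ip}/64", strict=False))
    return str(ip)


class TokenBucketLimiter:
    """One token bucket per client key, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate      # tokens added per second
        self.burst = burst    # maximum tokens a bucket can hold
        # key -> (tokens, timestamp of last update); a shared store would
        # have to replace this dict to limit across several proxies.
        self._buckets = {}

    def allow(self, addr: str, cost: float = 1.0, now: float = None) -> bool:
        """Return True and charge `cost` tokens if the client may proceed."""
        now = time.monotonic() if now is None else now
        key = client_key(addr)
        tokens, last = self._buckets.get(key, (self.burst, now))
        # Refill for the time elapsed since the last update, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= cost:
            self._buckets[key] = (tokens - cost, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

With `cost` set to the number of bytes served, `rate` becomes a bytes-per-second budget per client rather than a request count. Whether /64 is the right granularity for IPv6 is a judgment call; a narrower prefix lets one careless client spread load across many addresses, a wider one risks lumping unrelated users together.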