Thundering herd is a known post-outage, service restore failure mode. You never let your load balancer and the API boxes behind it dip below some statically defined low water mark; the waste of money is better than going down when the herd shows up. When it does show up, as noted, token bucket rate limiter running 429s while the herd slowly gets let back onto the system. Even if the app server can take it, it's not a given that the eg queue or database systems can absorb the herd as well (especially if that's what went down).