> Right, but it's already doing that, and runs just fine, from what I understand. The developers don't have to sit there pounding the enter key on their keyboards over and over all day to keep the messages flowing.

> Is the user count and message rate growing so quickly that people are constantly needing to make architectural changes and performance improvements in order to keep it scaling up? Does adding new capacity need constant human intervention?

> Or are they adding new crazy features all the time that are genuinely challenging to implement?

> As a software developer who has worked on big distributed systems, I'm well aware that things take a lot more work than they often seem from the outside, but this strains belief.

IMO, based on working on systems that weren't that large or high-revenue but where these things already applied, a lot of it is probably a combination of three things:

* You're bringing in enough total revenue that spending a couple million a year on a team of engineers to chase tiny marginal improvements in ad revenue (better targeting, new ways of presenting ads, etc.) can still easily pay for itself.

* You're running at a high enough scale / spending enough on resources that you can similarly justify spending millions on teams to knock more millions off your infra costs.

* You've got enough usage/users that tiny improvements in bug rates, crashes, etc. similarly result in enough extra usage to more than pay for the work. (And the list of bugs to squash is possibly never-ending if those other groups keep changing things!)

"Why make 30M profit on 100M revenue when you can make 35M profit on 115M revenue" sorta thing.