Do you ever get to a place where you actually have to scale it? Like, the PC from my teenage years would probably be more than fast enough for 80% of companies' data.
Also, are you sure your data is actually in a correct view across all these "realms"?
Basically I’m optimizing for viral growth, otherwise the business probably isn’t worth it. I haven’t launched yet but I estimate that at 100k DAU vertical scaling/using a single instance would become a nightmare because of throughput and latency rather than data size.
I’m admittedly using a strange architecture for my use case and I realize now commenting here opened too big a can of worms as explaining exactly what I’m doing would derail the thread. Suffice it to say, my db doesn’t contain just “Bob ordered 5 widgets to Foo Lane” data.
But yes, using a more horizontal database strategy makes it very easy to manage data across realms. That’s one of the main benefits. A single DB would be much harder as far as isolation and separating test/production traffic (assuming this is what you mean by views) than having multiple separable dbs that only talk to dev/staging. And I can easily wipe out and isolate dev and staging this way. I’m frankly shocked people would advocate a single db that doesn’t allow you to do this.
> Basically I’m optimizing for viral growth, otherwise the business probably isn’t worth it. I haven’t launched yet but
If you have not launched yet but are optimizing for Facebook-scale, that's not the optimal approach.
I can't comment on your database experience since I don't know it, but the vast, vast majority of people underestimate by orders of magnitude what a database can handle.
If you're not a large public company we all know about (and you're not, if you haven't launched yet), you don't need all the horizontal scale you seem to be building.
I remember joining one company (still a startup, but a large one about to IPO). My day-one briefing was about how this one database was in urgent need of replacement with a dozen-plus-node Cassandra cluster because it was about to exceed its capacity any second now. That was to be my highest-priority project.
I took some measurements on usage and capacity and put that project on the backburner. The db was nowhere near capacity on the small machine it was running on. Company grew a lot, did an IPO, grew some more. Years later I left. That db was still handling everything with plenty of headroom left to grow more.
> that at 100k DAU vertical
That's chump change even for a medium EC2/RDS instance, which should be capable of tens of millions of queries a day without the CPU or disk complaining at you (unless all your queries are unindexed table scans).
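To put rough numbers on it, here is a back-of-envelope sketch. The 200 queries per user per day and the 5x peak factor are numbers I made up for illustration, not measurements from anyone's app, so plug in your own.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(daily_active_users: int, queries_per_user_per_day: int) -> float:
    """Average queries per second if the load were spread evenly over a day."""
    return daily_active_users * queries_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float) -> float:
    """Crude peak estimate: average load multiplied by an assumed burst factor."""
    return avg_qps * peak_factor


if __name__ == "__main__":
    dau = 100_000                 # figure quoted in the thread
    queries_per_user = 200        # ASSUMPTION: illustrative only
    burst = 5.0                   # ASSUMPTION: illustrative only

    avg = average_qps(dau, queries_per_user)
    print(f"queries/day: {dau * queries_per_user:,}")
    print(f"average QPS: {avg:.1f}")
    print(f"peak QPS:    {peak_qps(avg, burst):.1f}")
```

Under those assumptions, 100k DAU works out to 20 million queries a day, which is a couple of hundred queries per second on average. Whether one instance handles that comfortably still depends on what each query does.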
> my db doesn’t contain just “Bob ordered 5 widgets to Foo Lane” data
It doesn't matter, it's still just bytes. What will matter is your query pattern relative to the efficacy of the database's query planner, and how updates/deletes impact this.
> makes it very easy to manage data across realms
You can just as easily do this at first as separate databases/schemas on the same physical server, with different users and permissions to prevent cross-database/schema joins so that when you need to move them to different machines it's an easier process.
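A minimal sketch of what that could look like, assuming PostgreSQL (the thread doesn't say which database is in use). The helper name `realm_setup_sql` and the `<realm>_app` / `<realm>_db` naming are hypothetical; the script only builds the SQL strings and leaves running them, and setting passwords, to you.

```python
from typing import List


def realm_setup_sql(realms: List[str]) -> List[str]:
    """Build PostgreSQL statements giving each realm its own role and database.

    Hypothetical helper: names follow a made-up <realm>_app / <realm>_db scheme.
    """
    statements: List[str] = []
    for realm in realms:
        # Realm names are interpolated into SQL, so accept only plain identifiers.
        if not realm.isidentifier():
            raise ValueError(f"unsafe realm name: {realm!r}")
        role = f"{realm}_app"
        database = f"{realm}_db"
        statements.extend([
            f"CREATE ROLE {role} LOGIN;",
            f"CREATE DATABASE {database} OWNER {role};",
            # Stop every other role from connecting to this realm's database.
            f"REVOKE CONNECT ON DATABASE {database} FROM PUBLIC;",
            f"GRANT CONNECT ON DATABASE {database} TO {role};",
        ])
    return statements


if __name__ == "__main__":
    for statement in realm_setup_sql(["dev", "staging", "prod"]):
        print(statement)
```

Since each realm only knows its own connection string and role, moving one of them to a separate machine later is mostly a dump, a restore, and a changed hostname.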
Everyone I know that has tested isolated multi-tenancy that wasn't dependent on legal needs ended up abandoning this approach and consolidating into as little hardware as possible. Heap Analytics had a blog post a few years ago about this, but I can't seem to find it.
Regardless, I hope you succeed in your endeavor and that you come back in a few months to prove us all wrong.
If it's a game, transactions are usually either very few per player per day (login, begin or finish playing a level, say something in a chat, spend money on a loot box, etc.) or easily sharded (e.g. 30 commands per second for each of 10 players in a multiplayer level that lasts 10 minutes, not for each player all the time).
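Working through that multiplayer example as a rough sketch: the per-level numbers come from the comment above, while the shard count and the CRC32-modulo routing are my own assumptions, picked only because they are simple to show.

```python
import zlib

COMMANDS_PER_PLAYER_PER_SECOND = 30
PLAYERS_PER_LEVEL = 10
LEVEL_DURATION_SECONDS = 10 * 60


def commands_per_second_per_level() -> int:
    """Command rate one running level generates across all its players."""
    return COMMANDS_PER_PLAYER_PER_SECOND * PLAYERS_PER_LEVEL


def commands_per_level() -> int:
    """Total commands over the lifetime of one level."""
    return commands_per_second_per_level() * LEVEL_DURATION_SECONDS


def shard_for_level(level_id: str, shard_count: int) -> int:
    """Route every command of a level to one shard (ASSUMED scheme: CRC32 modulo)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return zlib.crc32(level_id.encode("utf-8")) % shard_count


if __name__ == "__main__":
    print("commands/sec per level:", commands_per_second_per_level())
    print("commands per level:", commands_per_level())
    # "level-42" and 8 shards are placeholder values.
    print("shard for level-42:", shard_for_level("level-42", 8))
```

That is 300 commands per second and 180,000 commands per level, and because the level ID is the shard key, all of a level's traffic lands on one shard with no cross-shard coordination while the level is running.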