This is where having a little bit of swap can help you out. Not because you need swap, but because swap use percentage and swap I/O rates are good indicators. Something like 512 MB to maybe 1 GB; not something like 2x your memory (unless you're on a very small system, in which case use min(2x memory, 512 MB)). Having too much swap extends the amount of time your system can be swapping to death before it actually dies.
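The sizing rule above can be sketched as a one-liner (hypothetical helper; the 512 MB cap and the 2x-memory rule for small boxes are straight from the text):

```python
def recommended_swap_mb(ram_mb: int) -> int:
    # Normally ~512 MB is plenty; on a very small system, 2x RAM
    # is below the cap and wins. Either way, never 2x a big RAM.
    return min(2 * ram_mb, 512)

print(recommended_swap_mb(128))    # tiny box: 2x RAM = 256 MB
print(recommended_swap_mb(16384))  # normal box: capped at 512 MB
```

The point is the cap: more swap doesn't buy you safety, it buys you a longer death spiral.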

If your swap use jumps 10 points in a small time frame, you're running out of memory quickly. If your swap use hits 50%, 80%, or whatever threshold you pick, without any big jumps, you're running out of memory slowly.
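Those two signals can be sketched as a tiny classifier over consecutive samples (the thresholds are the illustrative ones from above, not gospel):

```python
def memory_pressure(prev_pct: float, curr_pct: float,
                    jump: float = 10.0, level: float = 80.0) -> str:
    """Classify pressure from two swap-use samples (in percent)."""
    if curr_pct - prev_pct >= jump:
        return "running out fast"    # big jump in a short window
    if curr_pct >= level:
        return "running out slowly"  # high but creeping, no spikes
    return "ok"
```

Feed it whatever your monitoring polls; the delta between samples matters as much as the absolute level.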

If your swap I/O is all output, it's not a huge deal... you're swapping out stuff you never read back. If you've got a lot of swapping in, chances are you're swapping to death.
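Same idea in code, using the si/so (pages swapped in/out per second) columns you'd get from Linux's `vmstat`; the 100-pages/s cutoff is an assumption for illustration:

```python
def swap_io_verdict(si_pages_per_s: float, so_pages_per_s: float) -> str:
    # si/so as reported by `vmstat 1` on Linux.
    if si_pages_per_s > 100:
        return "swapping to death"   # actively faulting pages back in
    if so_pages_per_s > 0:
        return "mostly harmless"     # cold pages pushed out, never read
    return "idle"
```

Sustained swap-in means the working set doesn't fit; swap-out alone just means the kernel found pages nobody misses.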

> The Core Problem: We are trying to write a "God Equation" for our load balancer. We started with row_count, which failed. We looked at disk usage, but that doesn't correlate with RAM because of lazy loading.

I'm a big fan of straight-up even distribution of requests. It's simple and predictable, and although it won't get you the best throughput, predictability and simplicity are often better than perfection. If you always send each node 1/Nth of requests, the worst case with a node that is broken but looks up is that you're still sending it a share when it should get nothing. With a utilization-based metric, a node that looks underutilized because it's just dropping requests, or responding with success-but-empty, sucks up all your requests. Alternatively, people have good results with selecting M nodes by metrics and then picking randomly between those. But also, IMHO, you want to reduce the work your load balancer(s) do, because load balancing load balancers is hard.
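Both strategies fit in a few lines; this is a minimal sketch, assuming a `load` dict keyed by node name (names and the M=2 choice are illustrative):

```python
import random

def pick_round_robin(nodes: list, counter: int):
    # Even distribution: every node gets 1/Nth, broken or not.
    return nodes[counter % len(nodes)]

def pick_m_then_random(nodes: list, load: dict, m: int = 2):
    # "Select M nodes by metric, then random between those":
    # take the M least-loaded nodes, pick one uniformly at random.
    best = sorted(nodes, key=lambda n: load[n])[:m]
    return random.choice(best)

nodes = ["a", "b", "c"]
print(pick_round_robin(nodes, 4))                      # -> "b"
print(pick_m_then_random(nodes, {"a": 1, "b": 9, "c": 2}))
```

The random step in the second strategy is what keeps a node that merely *looks* idle (because it's failing fast) from vacuuming up every request, while still steering most traffic away from genuinely hot nodes.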