The cost of consuming the firehose of the entire network is very low, so the costs that can actually blow up are storage and computation.

If you want to filter events by some heuristic (e.g. only from follows of server list), you can do that, and then specialize it further. For example, for ongoing threads that already pass your filter, you could add their IDs to an array and also accept all replies to those threads into your DB.
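A rough sketch of that idea in Python, just to make it concrete. Everything here is made up for illustration: the `Event` shape (`id`, `author`, `root_id`), the `ThreadAwareFilter` name, and the rule that an accepted reply also marks its root thread as tracked. Real events on the network will look different, and I'm using a set rather than an array so the lookup stays cheap.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape, not the network's real schema.
    id: str
    author: str
    root_id: Optional[str] = None  # ID of the thread root this replies to, if any


@dataclass
class ThreadAwareFilter:
    allowed_authors: set[str]
    tracked_threads: set[str] = field(default_factory=set)

    def accept(self, event: Event) -> bool:
        """Return True if the event should be written to the DB."""
        if event.author in self.allowed_authors:
            # The event passes the base heuristic, so remember its thread
            # (its root if it is a reply, otherwise its own ID).
            self.tracked_threads.add(event.root_id or event.id)
            return True
        # Otherwise accept it only if it replies into a tracked thread.
        return event.root_id is not None and event.root_id in self.tracked_threads


# Tiny simulated firehose: only "alice" is on the allow list.
flt = ThreadAwareFilter(allowed_authors={"alice"})
stream = [
    Event("1", "alice"),              # passes the base filter, starts a tracked thread
    Event("2", "bob", root_id="1"),   # reply to a tracked thread, accepted
    Event("3", "carol"),              # unrelated author, dropped
    Event("4", "dave", root_id="3"),  # reply to an untracked thread, dropped
]
stored = [e.id for e in stream if flt.accept(e)]
```

Here `stored` ends up as `["1", "2"]`: bob's reply gets in only because it belongs to a thread that already passed. In a long-running consumer you would probably want to expire old thread IDs so the set doesn't grow forever.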

You already get a stream of everything, so you can scale down what you write to the DB to exactly the characteristics you need, including keeping threads cohesive.