That's definitely doable. I'm planning something similar, except more web scraping / newsfeed / monitoring-like.

I've got 3x SBCs that can run the Gemma 4 26B MoE on the NPU. Around 4W extra power and 3 tokens a second, so they can hammer away at tasks 24/7 without moving the needle on the electricity bill.
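
Rough back-of-envelope for the numbers above, as a sketch only. It assumes the 4W and 3 tokens a second are per board (the post doesn't say which), and the electricity price of 0.30 per kWh is a made-up placeholder, so swap in your own:

```python
# Back-of-envelope: energy use and token throughput for always-on SBC inference.
# ASSUMPTION: 4 W extra draw and 3 tokens/s are per board, not for all three combined.
# ASSUMPTION: PRICE_PER_KWH is a hypothetical placeholder, not a figure from the thread.

EXTRA_WATTS = 4.0          # extra power draw while running inference
TOKENS_PER_SECOND = 3.0    # generation speed
NUM_BOARDS = 3             # "3x SBCs"
PRICE_PER_KWH = 0.30       # hypothetical electricity price

SECONDS_PER_DAY = 24 * 60 * 60


def daily_kwh(watts: float, boards: int) -> float:
    """Energy used per day in kWh when running 24/7."""
    return watts * boards * 24 / 1000


def daily_tokens(tokens_per_second: float, boards: int) -> float:
    """Tokens generated per day when running 24/7."""
    return tokens_per_second * boards * SECONDS_PER_DAY


def yearly_cost(watts: float, boards: int, price_per_kwh: float) -> float:
    """Electricity cost per year (365 days) at the given price."""
    return daily_kwh(watts, boards) * 365 * price_per_kwh


if __name__ == "__main__":
    print(f"kWh per day:    {daily_kwh(EXTRA_WATTS, NUM_BOARDS):.3f}")
    print(f"tokens per day: {daily_tokens(TOKENS_PER_SECOND, NUM_BOARDS):,.0f}")
    print(f"cost per year:  {yearly_cost(EXTRA_WATTS, NUM_BOARDS, PRICE_PER_KWH):.2f}")
```

Under those assumptions it works out to well under 1 kWh a day for all three boards, against several hundred thousand tokens a day.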

I wonder if some investment firms are already doing this internally at a large scale. (Probably.)

They are - I’ve seen it.

They just use APIs though. There is very little interest within those firms in doing the model engineering and inference in-house.