A lot of that has already been solved for by scaling workers to cores along with techniques like greenlets/eventlets that support concurrency without true multithreading to take better advantage of CPU capacity.
A lot of that has already been solved for by scaling workers to cores along with techniques like greenlets/eventlets that support concurrency without true multithreading to take better advantage of CPU capacity.
That's great for concurrency, but doesn't improve parallelism.
Unless you mean you have multiple worker processes (or GIL-free threads).
But you are still more or less limited to one CPU core per Python process. Yes, you can use that core more effectively, but you still can't scale up very effectively.