This has always confused me... is Python really that much better at rapid dev? I work on a Python project and every day I wish the people that started the project had chosen a different language that actually scaled well with the problem rather than Python, which they likely chose because it was for "rapid dev".
You can run Python processes in parallel for "scaling". YouTube and Uber run Python backends. The extra hardware is cheaper than developer time per hour.
Sure, there's multiprocessing, but historically no multithreading (relatively recently there is the free-threading interpreter). Each of those processes / threads will also execute slowly compared to most other languages. But we agree Python is not performance oriented... I'm just curious why people think it's a good trade-off when I suspect that writing the same code in, say, Java will take roughly the same time, be easier to maintain (compiler-assisted refactoring, type safety, etc.), and execute faster.
After having worked with Java extensively during my time at Amazon, especially during the log4shell bullshit, I can safely say that if you think Java is viable for anything these days, you lack real-world experience of how an actually good, efficient dev process is done.
But on a slightly less edgy note: Java doesn't have a fast REPL, has nowhere near the number of libraries available for it that Python does, and requires tuning to actually make it performant. It's by far not the same amount of time to develop. It doesn't have things like Jupyter notebooks for prototyping, doesn't have an easy-to-use interpreter, doesn't have the ability to inject code during debug sessions, doesn't have easy system integration (i.e. with a module, you can install it in editable mode, you can have an executable you can run from the command line, etc.), doesn't have things like dynamic import (where you can reload modules as the program is running), can't (easily) import inside functions, and so on.
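For reference, the dynamic reload and function-level import mentioned above can be sketched with the standard library's `importlib.reload`. This is only an illustration: the module name `hot_config` and its `GREETING` value are made up, and the module is written to a throwaway temp directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip writing .pyc files so a quick edit within the same second
# cannot be masked by a stale bytecode cache.
sys.dont_write_bytecode = True


def demo_reload():
    """Import a module inside a function, edit its source, and reload it."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "hot_config.py"  # hypothetical module
        module_path.write_text("GREETING = 'hello'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import hot_config  # imports are legal inside a function body

            before = hot_config.GREETING
            # Simulate editing the file while the program keeps running.
            module_path.write_text("GREETING = 'hello again'\n")
            importlib.reload(hot_config)
            after = hot_config.GREETING
        finally:
            # Clean up so the temporary module does not linger.
            sys.path.remove(tmp)
            sys.modules.pop("hot_config", None)
    return before, after


if __name__ == "__main__":
    print(demo_reload())
```

Note that `importlib.reload` re-executes the module in place; objects created from the old definitions are not updated automatically.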
>Sure, there's multiprocessing, but historically no multi-threading
Functionally, when the CPU switches context the main thing that matters is the stack frame. This happens with both threading and multiprocessing. The only differences between the two are a) data access and b) initial startup time.
a is solved by unix pipes, which are already part of multiprocessing in Python and are plenty fast for anything you wanna do, and b is solved by spinning up the processes ahead of time in worker pools, which is also already part of multiprocessing in Python.
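A minimal sketch of the worker-pool approach described above, using `multiprocessing.Pool`. The `square` task and the `run_pool` helper are placeholders invented for the example, and the `"fork"` start method is an assumption that only holds on Unix-like systems.

```python
import multiprocessing as mp


def square(n):
    """Placeholder for real per-item work."""
    return n * n


def run_pool(values, workers=4):
    # Assumption: "fork" start method (Unix only), which matches the
    # copy-the-parent model discussed in the thread.
    ctx = mp.get_context("fork")
    # Worker processes are started once when the pool is created;
    # map() then ships items to them and collects results in order.
    with ctx.Pool(processes=workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(run_pool(range(10)))
```

Arguments and results are pickled and sent between processes, so this suits work where the per-item payload is small relative to the computation.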
You appear to have been traumatized by Java... You can sub in any statically typed, compiled language.
The key distinction between multiprocessing and multithreading is that in multithreading all the threads of a process share the same memory. Are you saying that Python uses pipes for providing shared access to memory across processes? It looks like there is: multiprocessing.shared_memory, but this all seems more complicated than it is in other languages. Perhaps it's not?
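To give a sense of what `multiprocessing.shared_memory` involves, here is a small sketch in which a child process modifies a buffer that the parent then reads back without copying it through a pipe. The helper names are made up, and the `"fork"` start method is again a Unix-only assumption.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def double_in_place(shm_name, length):
    """Child: attach to the existing block by name and modify it."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        for i in range(length):
            shm.buf[i] = shm.buf[i] * 2
    finally:
        shm.close()  # detach only; the parent owns the block


def demo_shared_memory():
    data = bytes([1, 2, 3, 4])
    shm = shared_memory.SharedMemory(create=True, size=len(data))
    try:
        shm.buf[:len(data)] = data
        ctx = mp.get_context("fork")  # assumption: Unix-like system
        child = ctx.Process(target=double_in_place, args=(shm.name, len(data)))
        child.start()
        child.join()
        # The child's writes are visible here without any copy back.
        return bytes(shm.buf[:len(data)])
    finally:
        shm.close()
        shm.unlink()  # free the block once nobody needs it


if __name__ == "__main__":
    print(list(demo_shared_memory()))
```

The explicit `close()`/`unlink()` lifecycle and the raw byte buffer are the extra bookkeeping compared with threads, which see the same objects directly.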
I will 100% give you that Python has more libraries, but that's the only real advantage I was able to glean.
No, I'm saying that you can transport data over unix sockets, which are in memory, and Python has that as part of multiprocessing.
The way Python multiprocessing is set up is that it acts like a thread: when you launch a function, it copies the memory of the current process into the new one. You pay the startup overhead for that, so instead you launch worker processes ahead of time and transport data to them over unix pipes. Pretty much as fast as threading.
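The pre-launched worker fed over a pipe can be sketched with `multiprocessing.Pipe` and a long-lived `Process`. The doubling task, the `None` shutdown sentinel, and the function names are all invented for the example; `"fork"` is assumed as the start method (Unix only).

```python
import multiprocessing as mp


def worker(conn):
    """Long-lived worker: read items until a None sentinel arrives."""
    while True:
        item = conn.recv()
        if item is None:
            break
        conn.send(item * 2)  # placeholder for real work
    conn.close()


def demo_pipe_worker(items):
    ctx = mp.get_context("fork")  # assumption: Unix-like system
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=worker, args=(child_conn,))
    proc.start()  # startup cost is paid once, up front
    child_conn.close()  # the parent does not use the child's end

    results = []
    for item in items:
        parent_conn.send(item)  # pickled and written to the pipe
        results.append(parent_conn.recv())

    parent_conn.send(None)  # ask the worker to exit
    proc.join()
    parent_conn.close()
    return results


if __name__ == "__main__":
    print(demo_pipe_worker([1, 2, 3]))
```

Every `send` pickles its argument, so the cost of this channel grows with the size of the objects passed through it.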
>You appear to have been traumatized by Java
Not really, I just have standards. Java is just a poorly designed language with poor community support. Look at the amount of code it takes to send an HTTP request in Java, versus Python.
> it copies the memory of the current process into the new one
And if that memory is large enough then you OOM, so you would have to manage shared memory separately to prevent it from being copied into each process. It's not impossible, just complicated for some use-cases where using threads is a preferable paradigm.
> Look at the amount of code it takes to send an HTTP request in Java, versus Python
This is kind of my point.... (Java)
versus (Python)
Yes, it's more verbose, but not to the extent that I'd trade the compiler and performance for the less verbose version, especially considering auto-complete in IDEs.
>And if that memory is large enough then you OOM
OOM is not an issue these days lol. Memory is dirt cheap, and the amount of memory that gets copied is minuscule per process.
You still have to design the application around it, just like with threading.
>(Java)
Forgetting the main class there, and also any additional headers that you need to send.
> the amount of memory that gets copied is minuscule per process
This is what is not always true. Sometimes you have a processing context that is GB in size, in which case scaling via multiple processes is not as simple as if you had access to threads that share that context by design. You will run out of memory really quickly if you spin up enough processes, and even if you have unlimited RAM you could have used 1/nth the memory. If you implement some method of sharing memory between processes to consume less overall, it will still come at the expense of speed and complexity.
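The threads-share-the-context point can be sketched as follows. The `build_context` function is a small stand-in for a multi-GB structure, and all names are hypothetical. On a standard GIL build the threads will not run Python bytecode in parallel, but every one of them reads the same single copy of the context.

```python
from concurrent.futures import ThreadPoolExecutor


def build_context(size):
    """Stand-in for a large, expensive-to-build processing context."""
    return list(range(size))


def lookup_sum(context, indices):
    """Read-only work against the shared context."""
    return sum(context[i] for i in indices)


def run_threads(context, batches, workers=4):
    # Every thread receives a reference to the same context object,
    # so memory use does not grow with the number of workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: lookup_sum(context, batch), batches))


if __name__ == "__main__":
    ctx = build_context(1000)
    print(run_threads(ctx, [[0, 1, 2], [10, 20], [999]]))
```

With processes, the same pattern needs either one copy of the context per worker or an explicit shared-memory layout.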
> Forgetting the main class there
In Java 21+ you can omit it (initially as a preview feature):
But this is neither here nor there; I think it's well worth paying that tax. You only type it once, right?
Here is something I find quite funny: when I search "multithreading vs multiprocessing" on Google, I get the following AI response:
So, it assumes that I'm using Python... and that's why threads are not preferable... got it.
I worked on a Python codebase for several years that had over two million LoC. The code was well-organized and under constant development by at least ten devs. There were Java components in the larger project that were occasionally touched on, and it was always slower to work with those parts.
A well-organized Python codebase really is faster to work on than one in other languages. If your dev work is slowing down as the codebase grows, that's a sign you're accruing tech debt by not keeping up with organizing it.