A painful rewrite in another language is usually the only option in my experience.

If you're really lucky you have a small hot part of the code and can move just that to another language (a la Pandas, Pytorch, etc.). But that's usually only the case for numerical computing. Most Python code has its slowness distributed over the entire codebase.

It’s not painful, that’s the point. You have a working prototype now ready to port. (If the destination language is painful perhaps, but don’t do that.)

I recently ported a Python program to Rust and it took me much less time the second time, even though I write Rust more slowly per-line. Because I knew definitively what the program needed.

And if even that is too much optimizing the Python or adding Cython to a few hot loops is less difficult.

I have also ported a Python program to Rust (got a ~50x speedup) but this was a smallish program, under 10k lines of code.

Porting larger programs is rarely tractable. You can tell that because several large companies have decided that writing their own Python runtimes that are faster is less effort (although they all eventually gave up on that as far as I know).

Could happen with any language. The well-known Spolsky piece was about a C++ to C++ rewrite if memory serves. Blaming repeated poor decisions on a prototyping/glue language is yet another instance. Luckily there's lots of options today to dig out of them.