I'd love to learn more about this. What resources/books/articles/code can I look at to understand this more? Or, if you have some time, would you mind expanding on it?

The parts I'm specifically interested in:

1. What the 300-line pool and allocator look like
2. What this means: "BTW the for-case can simply be supported by setting a pool/global boolean and using that to decide how to wait for a new task (during the parallel for the boolean will be true, otherwise do sleeps with mutexes in the worst case for energy saving)"

Thank you!

This stuff is sometimes difficult to search for because people don’t name it or there are many different names.

Arena allocation on Windows, for example, is basically calling VirtualAlloc for a couple of gigabytes on a 64-bit system (you have terabytes of virtual address space available) and then slicing that range up into sub-ranges that you pass as parameters to your threads. Group the sub-ranges hierarchically: one per CPU, then within that one per group of cores that share a cache, then one per single core for its own cache. Lock the software threads to their hardware threads and you're done.

Then, within each arena, use bump allocators and maybe pool allocators for most stuff. It's very basic and very little code, and it performs much better than most software out there. It's also why a lot of diehard C programmers find Rust's lifetime management overengineered and boring, btw: you end up with far fewer distinct lifetimes than in typical modern C++ code.
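To make that a bit more concrete, here is a minimal sketch of the arena idea in C. It is not the 300-line version mentioned above, just an illustration. All names (`Arena`, `arena_reserve`, `arena_sub`, `arena_alloc`, `arena_reset`) are made up for this example, and outside Windows it falls back to plain `malloc` for the backing block so it compiles anywhere. Thread pinning and the CPU/cache hierarchy are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <stdlib.h>
#endif

/* A bump arena: one contiguous block plus an offset that only grows. */
typedef struct {
    uint8_t *base;  /* start of the block        */
    size_t   cap;   /* total bytes in the block  */
    size_t   used;  /* bytes handed out so far   */
} Arena;

/* Grab one big block up front. On Windows this is the VirtualAlloc call
   from the text; elsewhere malloc is only a portable stand-in. */
static Arena arena_reserve(size_t cap) {
    Arena a = {0};
#ifdef _WIN32
    a.base = VirtualAlloc(NULL, cap, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    a.base = malloc(cap);
#endif
    a.cap = a.base ? cap : 0;
    return a;
}

/* Bump allocation: round the current address up to `align` (a power of
   two), then advance. Returns NULL when the arena is full. */
static void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t cur     = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (cur + align - 1) & ~(uintptr_t)(align - 1);
    size_t    start   = (size_t)(aligned - (uintptr_t)a->base);
    if (start > a->cap || size > a->cap - start) return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Slice a sub-range out of a parent arena, e.g. one per thread.
   64-byte alignment is an assumption about the cache-line size. */
static Arena arena_sub(Arena *parent, size_t cap) {
    Arena   sub = {0};
    uint8_t *mem = arena_alloc(parent, cap, 64);
    if (mem) { sub.base = mem; sub.cap = cap; }
    return sub;
}

/* Freeing is just resetting the offset; everything in the arena dies at once. */
static void arena_reset(Arena *a) { a->used = 0; }
```

Usage would be something like: reserve one root arena at startup, call `arena_sub` once per worker thread, hand each thread its own `Arena` by pointer, and call `arena_reset` at the end of each frame or job. Since each thread only touches its own sub-range, no locking is needed.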

For the boolean stuff, look at the Better Software Conference talk on YouTube about that video game physics engine, for example (sorry, I'm on my phone and on the go). Again, old ideas being rediscovered.
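Roughly, the boolean trick could look like the sketch below, using pthreads and C11 atomics. This is my own illustration, not the code from the talk. All names (`Pool`, `hot`, `parallel_for`, `try_run_one`, and so on) are hypothetical. To keep it short and obviously race-free, tasks are claimed under the same mutex; a real pool would more likely use a lock-free queue there. The point is only the waiting strategy: workers busy-wait while `hot` is true and sleep on a condition variable otherwise.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*TaskFn)(int index, void *ctx);

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    atomic_bool     hot;   /* true while a parallel for is in flight */
    atomic_bool     quit;
    atomic_int      done;  /* iterations finished in the current loop */
    int             next, count;  /* guarded by lock */
    TaskFn          fn;           /* guarded by lock */
    void           *ctx;          /* guarded by lock */
} Pool;

/* Claim one loop index under the lock and run it outside the lock. */
static bool try_run_one(Pool *p) {
    pthread_mutex_lock(&p->lock);
    bool   got = p->next < p->count;
    int    i   = p->next;
    TaskFn fn  = p->fn;
    void  *ctx = p->ctx;
    if (got) p->next++;
    pthread_mutex_unlock(&p->lock);
    if (got) { fn(i, ctx); atomic_fetch_add(&p->done, 1); }
    return got;
}

static void *worker(void *arg) {
    Pool *p = arg;
    for (;;) {
        if (try_run_one(p)) continue;
        if (atomic_load(&p->quit)) return NULL;
        if (atomic_load(&p->hot)) { sched_yield(); continue; } /* busy-wait */
        pthread_mutex_lock(&p->lock);                          /* cold: sleep */
        while (!atomic_load(&p->hot) && !atomic_load(&p->quit))
            pthread_cond_wait(&p->wake, &p->lock);
        pthread_mutex_unlock(&p->lock);
    }
}

static void pool_start(Pool *p, pthread_t *threads, int nthreads) {
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wake, NULL);
    atomic_init(&p->hot, false);
    atomic_init(&p->quit, false);
    atomic_init(&p->done, 0);
    p->next = p->count = 0; p->fn = NULL; p->ctx = NULL;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, worker, p);
}

/* Flip the flag, wake sleepers, help with the work, wait for stragglers. */
static void parallel_for(Pool *p, int n, TaskFn fn, void *ctx) {
    pthread_mutex_lock(&p->lock);
    p->fn = fn; p->ctx = ctx; p->next = 0; p->count = n;
    atomic_store(&p->done, 0);
    atomic_store(&p->hot, true);
    pthread_cond_broadcast(&p->wake);
    pthread_mutex_unlock(&p->lock);
    while (try_run_one(p)) {}
    while (atomic_load(&p->done) < n) sched_yield();
    atomic_store(&p->hot, false);  /* workers drift back to sleeping */
}

static void pool_stop(Pool *p, pthread_t *threads, int nthreads) {
    pthread_mutex_lock(&p->lock);
    atomic_store(&p->quit, true);
    pthread_cond_broadcast(&p->wake);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < nthreads; i++) pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->wake);
}

/* Example task and driver: square 100 numbers on 4 workers, then sum them. */
static void square_task(int i, void *ctx) { ((int *)ctx)[i] = i * i; }

static long demo_sum_of_squares(void) {
    enum { N = 100, WORKERS = 4 };
    static int values[N];
    pthread_t  threads[WORKERS];
    Pool       pool;
    pool_start(&pool, threads, WORKERS);
    parallel_for(&pool, N, square_task, values);
    pool_stop(&pool, threads, WORKERS);
    long sum = 0;
    for (int i = 0; i < N; i++) sum += values[i];
    return sum;
}
```

The design choice is the trade-off from the quote: spinning gives the lowest latency between loop iterations but burns a core, so you only do it while the flag says a parallel for is running, and fall back to the mutex and condition variable the rest of the time to save energy.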