You hit the nail on the head. C++20 coroutines are indeed complex and the barrier to entry is high. However, that complexity forced me to start from first principles and work through the essential problems from the ground up, which gave me a much deeper understanding of how coroutines actually work. That is exactly why I built this project: I wanted a minimal "laboratory" for dissecting stackless coroutines without the overhead of a massive framework like Seastar. Regarding your point on Erlang/Go: that's actually the goal of this scheduler. It implements an M:N threading model with work stealing to approximate that kind of "lightweight process" concurrency, while giving you manual control over the mechanics. Hope this helps you finally wrap your head around co_await!

I think there is some hope of a sane wrapper around C++20 coroutines that will make them easier to use. I saw a tutorial a while back mentioning that one might eventually become part of the C++ standard. I once tried to use Boost coroutines, but it was too much of a headache and I switched to a different approach.

Erlang's processes and goroutines are stackful, unlike C++ coroutines. Erlang also forbids observable data sharing between processes, which avoids a lot of pitfalls. I don't think that can be enforced in C++ or Go.

GHC lightweight threads and its STM library (software transactional memory) could be another thing to look at. I wonder if a useful STM feature is feasible for tiny_coro.

You are spot on about the distinction. C++ chose the stackless path for "zero-overhead" efficiency, but that choice shifts the burden of safety and usability onto library developers. Regarding safety and data sharing: you're right, C++ won't enforce isolation like Erlang. That's the trade-off we make for performance. However, to address your point about "sane wrappers" and concurrency models: I implemented Go-style channels (CSP) on top of this scheduler to bridge that gap. They use co_await to mimic Go's channel behavior, with two optimizations: direct handoff (the value skips the buffer/queue if a receiver is already waiting) and an await_suspend that returns false on the fast path, so the coroutine keeps running without a context switch. While a full-blown STM might be too heavy, I found that combining these channels with Epoch-Based Reclamation (EBR) gives a pretty robust safety net without the overhead of heavy locking. If you're interested, the Channel implementation is here: https://github.com/lixiasky-back/tiny_coro-build_your_own_MN...