Stackful coroutines make for cute demos, but each one needs a huge stack if you actually end up calling into libc on Linux, which tends to assume typical OS thread stack sizes (8 MB by default). (I don't disagree that some of the other tradeoffs are nice, and I have no love for C++20 coroutines myself.)
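
If you want to check the 8 MB figure on your own box, here is a minimal sketch (Python only for brevity; the helper name `stack_limit_mib` is mine). It reads the soft `RLIMIT_STACK`, which as far as I know is what glibc uses as the default stack size for new pthreads when the limit isn't "unlimited". Your distro or shell may set something different.

```python
import resource


def stack_limit_mib():
    """Return the soft RLIMIT_STACK in MiB, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return None
    # The limit is reported in bytes; convert to MiB for readability.
    return soft / (1024 * 1024)


if __name__ == "__main__":
    limit = stack_limit_mib()
    print("unlimited" if limit is None else f"{limit:.0f} MiB")
```

That is the budget code called from a coroutine may quietly be counting on, so a stackful runtime that hands out stacks of a few KB each is betting that nothing it calls goes deep.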