> threads are inherently and inescapably too heavy weight to express concurrency in an efficient way

Your premise is wrong. There are many counterexamples to this.

Can you explain more ? I always heard this.

The most promiment example is probably Go with its goroutines, but there are so many more. You can easily spawn tens of thousands of goroutines, with low overhead and great performance.

Goroutines/"fibers"/"green threads" are usually scheduled by the runtime system across a small pool of actual OS threads.

The word "thread" is confusing things. In computer science a thread represents a flow of execution, which in concrete terms where execution is a series of function calls, is typically a program counter and a stack.

There are many ways to implement and manage threads. In Unix-like and Windows systems a "thread" is the above, plus a bunch of kernel context, plus implicit preemptive context switching. Because Unix and Windows added threads to their architectures relatively late in their development, each thread has to behave sort of like its own process, capable of running all the pre-existing software that was thread-agnostic. Which is why they have implicit scheduling, large userspace stacks, etc.

But nothing about "thread" requires it to be implemented or behave exactly like "OS threads" do in popular operating systems. People wax on about Async Rust and state machines. Well, a thread is already state machine, too. Async Rust has to nest a bunch of state machine contexts along with space for data manipulated in each function--that's called a stack. So Async Rust is one layer of threading built atop another layer of threading. And it did this not because it's better, but primarily because of legacy FFI concerns and interoperability with non-Rust software that depended on the pre-existing ABIs for stack and scheduling management.

Go largely went in the opposite direction, embracing threads as a first-class concept in a way that makes it no less scalable or cheap than Rust Futures, notwithstanding that Go, too, had to deal with legacy OS APIs and semantics, which they abstracted and modeled with their G (goroutine), M (machine), P (processor) architecture.

I thought it was obvious from context: OS threads are too heavyweight for fine grained concurrency

Go uses userspace threads. It’s also interesting that Go and Java are the only mainstream languages to have gone this route. The reason is that it has a huge penalty when calling FFI of code that doesn’t use green threads whereas this cost isn’t there for async/await.

Also that you have to rewrite the entire standard library, because the kernel knows how to suspend kernel threads on syscalls, but not green threads. (Go and Java already had to do this anyway, of course.)