Yeah. I think I've only ever found one situation where offloading work to a worker saved more time than was lost through serializing/deserializing. Doing heavy work often means working with a huge set of data- which means the cost of passing that data via messages scales with the benefits of parallelizing the work.
I think the clues are all there in the MDN docs for web workers. Having a worker act as a forward proxy for services; you send it a URL, it decides if it needs to make a network request, it cooks down the response for you and sends you the condensed result.
Most tasks take more memory in the middle that at the beginning and end. And if you're sharing memory between processes that can only communicate by setting bytes, then the memory at the beginning and end represents the communication overhead. The latency.
But this is also why things like p-limit work - they pause an array of arbitrary tasks during the induction phase, before the data expands into a complex state that has to be retained in memory concurrent with all of its peers. By partially linearizing you put a clamp on peak memory usage that Promise.all(arr.map(...)) does not, not just the thundering herd fix.