Every time someone does something with threading and makes it a language feature it always seems like it could just be done with stock C++.

Whatever this is doing could be wrapped up in another language.

Either way it's arguable that is even a good idea, since dealing with a regular thread in the same memory space, getting data to and from the GPU and doing computations on the GPU are all completely separate and have different latency characteristics.