>I would strongly advise against this. GPUs are highly efficient when neighboring threads within a warp access neighboring data and follow largely the same code path. Even across warps, data locality is highly desirable.
It's a bit like saying writing code at all is bad, though. Divergence isn't desirable, but neither is running any code at all - sometimes you need it to solve a problem.
Not supporting divergence at all is a huge mistake IMO. It isn't good, but sometimes it's necessary.
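To put rough numbers on that, here's a toy cost model in Python. It's a sketch, not real GPU code: the function name and the cycle costs are made up, and it assumes the simplest SIMT picture, where a warp pays for each side of a branch that at least one of its lanes takes.

```python
WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def warp_branch_cost(takes_if, if_cost, else_cost):
    """Toy cost of one if/else for a single warp.

    Assumption: each side of the branch is executed once for the whole
    warp if any lane takes it, with the other lanes masked off.
    """
    if len(takes_if) != WARP_SIZE:
        raise ValueError("expected one flag per lane")
    any_if = any(takes_if)        # at least one lane takes the 'if' side
    any_else = not all(takes_if)  # at least one lane takes the 'else' side
    return (if_cost if any_if else 0) + (else_cost if any_else else 0)


# Hypothetical costs: a rare slow path (100) and a common fast path (10).
uniform = [False] * WARP_SIZE               # every lane takes the fast path
mixed = [True] + [False] * (WARP_SIZE - 1)  # one lane needs the slow path

print(warp_branch_cost(uniform, 100, 10))  # 10
print(warp_branch_cost(mixed, 100, 10))    # 110
```

In this model the divergent warp pays for both paths, but warps that stay uniform pay nothing extra, so a rare divergent case only costs you in the warps that actually hit it.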
>Could you kindly share a source for this? Shader Execution Reordering (SER) is available for Ray tracing, but it is not a general-purpose feature that can be used in generic compute shaders.
https://docs.nvidia.com/cuda/cuda-programming-guide/03-advan...
My understanding is that this is fully transparent to the programmer; it's just more advanced scheduling for threads. SER is something different entirely.
Nvidia are a bit vague here, so you have to go digging into patents if you want more information on how it works.