Can you do GPU -> NPU -> GPU for streaming workloads? The GPU can be more flexible than Tensor HW for preprocessing, light branching, etc.
Also, the Strix Halo NPU is rated at 50 TOPS. The desktop RDNA 4 chips are into the hundreds of TOPS.
As for consumer uses, like I mentioned, it's an open question. Blender? FFmpeg? Database queries? Audio?