What do you need Matrix Cores for when you already have an NPU that can access the same memory, and even seems to include more flexible FPGA fabric? It's six of one, half a dozen of the other.

I have the HP ZBook G1a running the same CPU and RAM under HP's Ubuntu. I haven't seen any out-of-the-box way to use the NPU. I can get ROCm software to run, but it doesn't use it, and no system tools show its activity that I can see. It seems to be a marketing gimmick. Shame.


https://news.ycombinator.com/item?id=43671940#43674311

> The PFB is found in many different application domains such as radio astronomy, wireless communication, radar, ultrasound imaging and quantum computing.. the authors worked on the evaluation of a PFB on the AIE.. [developing] a performant dataflow implementation.. which made us curious about the AMD Ryzen NPU.

> The [NPU] PFB figure shows.. speedup of circa 9.5x compared to the Ryzen CPU.. TINA allows running a non-NN algorithm on the NPU with just two extra operations or approximately 20 lines of added code.. on [Nvidia] GPUs CUDA memory is a limiting factor.. This limitation is alleviated on the AMD Ryzen NPU since it shares the same memory with the CPU providing up to 64GB of memory.
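For context on what a PFB actually computes: a critically sampled polyphase filter bank channelizer is just a windowed polyphase sum followed by an FFT across channels. Here's a minimal CPU-only NumPy sketch (my own illustration of the general algorithm, not TINA's NPU mapping; function and parameter names are made up):

```python
import numpy as np

def pfb_channelize(x, n_chan, n_taps, window=np.hamming):
    """Critically sampled polyphase filter bank channelizer (sketch).

    x: 1-D input signal; n_chan: number of output channels;
    n_taps: filter taps per polyphase branch.
    Returns an array of shape (n_blocks, n_chan) of complex channel outputs.
    """
    # Prototype low-pass filter: windowed sinc with n_chan * n_taps coefficients.
    n = np.arange(n_chan * n_taps)
    h = np.sinc((n - n_chan * n_taps / 2) / n_chan) * window(n_chan * n_taps)

    # Take overlapping blocks of n_chan * n_taps samples, hopping by n_chan,
    # and reshape each block into (n_taps, n_chan) polyphase branches.
    n_blocks = len(x) // n_chan - (n_taps - 1)
    frames = np.lib.stride_tricks.sliding_window_view(
        x, n_chan * n_taps)[::n_chan][:n_blocks]
    frames = frames.reshape(n_blocks, n_taps, n_chan)
    taps = h.reshape(n_taps, n_chan)

    # Weighted sum over the taps axis, then FFT across the channel axis.
    summed = (frames * taps).sum(axis=1)
    return np.fft.fft(summed, axis=1)
```

A complex tone at frequency k/n_chan lands in channel k, e.g. `pfb_channelize(np.exp(2j * np.pi * (8 / 64) * np.arange(4096)), 64, 8)` peaks in channel 8. The reshape-multiply-sum-FFT structure is also why the workload maps well onto tensor-style hardware: it's dense, regular arithmetic with no branching.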

The NPU is generally pretty weak and isn't pipelined into the GPU's logic (which is already quite large on-die). The past 10 years seem to have taught us that if you're going to build tensor-specific hardware, it makes the most sense to put it in your GPU, not in a dark-silicon coprocessor.

Can you do GPU -> NPU -> GPU for streaming workloads? The GPU can be more flexible than Tensor HW for preprocessing, light branching, etc.

Also, the Strix Halo NPU is 50 TOPS. The desktop RDNA 4 chips are into the hundreds.

As for consumer uses, I mentioned it's an open question. Blender? FFmpeg? Database queries? Audio?