DGX Spark is really poor at inference because of its limited memory bandwidth, so hopefully they’ve fixed that before touting this as a way to run local models.
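
Back-of-envelope for why bandwidth is the bottleneck: when generating tokens one at a time, a dense model has to read roughly all of its weights from memory for every token, so bandwidth divided by weight size gives a rough ceiling on tokens per second. A minimal sketch, assuming the commonly quoted ~273 GB/s figure for DGX Spark and a hypothetical 70 GB set of weights (both numbers are assumptions for illustration, not measurements):

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on single-stream decode speed for a dense model.

    Assumes every generated token streams all weights from memory once and
    ignores KV-cache traffic, compute time, and batching.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Assumed figures: ~273 GB/s bandwidth, hypothetical 70 GB of weights.
    bound = max_decode_tokens_per_s(273.0, 70.0)
    print(f"upper bound: {bound:.1f} tokens/s")
```

Real throughput will land below this ceiling, and mixture-of-experts models only read their active weights per token, so treat it as a sanity check rather than a prediction.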

I think DGX Spark has poor memory bandwidth because these laptops were the plan all along. NVIDIA didn't want to commit to the extra cost of a 512-bit memory bus for their first laptop SoC, so they went with the more modest 256-bit bus, the same as AMD did for Strix Halo.
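
To put rough numbers on the bus-width tradeoff: peak bandwidth is just bus width in bytes times the transfer rate. A quick sketch, assuming LPDDR5X at 8533 MT/s for DGX Spark and 8000 MT/s for Strix Halo (those data rates are my assumptions, and the 512-bit row is purely hypothetical):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits: total width of the memory bus in bits.
    transfer_rate_mt_s: data rate in mega-transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000.0


if __name__ == "__main__":
    # Data rates below are assumed, not taken from the thread.
    configs = [
        ("256-bit @ 8533 MT/s (assumed DGX Spark)", 256, 8533),
        ("256-bit @ 8000 MT/s (assumed Strix Halo)", 256, 8000),
        ("512-bit @ 8533 MT/s (hypothetical)", 512, 8533),
    ]
    for label, bits, rate in configs:
        print(f"{label}: {peak_bandwidth_gb_s(bits, rate):.0f} GB/s")
```

Doubling the bus doubles the theoretical peak, which is the cost-versus-bandwidth tradeoff being described here.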