I have one, and I love it. That said, my buddy's Mac smokes it for inference workloads in terms of tokens per second AND it's more usable for other things.

If you are training and doing research it's great, and if you want to cluster them it can't be beat, but if you just want local inference on a single box, buy a Mac or even a Strix Halo device.

can those macs boot linux? i've heard about Asahi but have no idea how far along they are. i've got my fleet configured with nix and sure, nix can target darwin, but there's a _lot_ of sharp edges there: i don't really want to pull that thread unless i have to...

I don't know. I think he just uses LM Studio most of the time on his, but that's one place I can say the Spark really shines for me.

I'm a Linux guy, but I also don't always have a lot of time. The Spark comes out of the box with a nice Linux distro that's pre-configured to be easy to set up, and the guides and online resources make getting up and running trivial, even for some complex tasks. You would have to do a LOT of tinkering just to figure out some of the things the NVIDIA resources walk you through natively. They have guides for a ton of stuff that include the optimal settings, so you don't have to figure it all out through trial and error.

Check out these "playbooks" for some examples. [0] There's a lot to be said for not having to piece all that together yourself.

[0] https://build.nvidia.com/spark

I think going from unboxing mine, to setting it up to run headless, to generating tokens took about 20 minutes total.

Not the new ones. Only the M1 and M2 have good support for Asahi. But you really don't need it. If you need Linux, use a VM (UTM is free and is equivalent to KVM/QEMU in speed, despite being a Type-2 Hypervisor.)

which mac is smoking the spark?

Mine, for one. M5 Max MacBook Pro 128GB with a 4TB SSD. $5100 after a $1000 discount at Microcenter. Great deal if you can find it in stock.

pretty much any of them, dude, as long as you have enough RAM, since it uses unified RAM and a powerful SoC CPU/GPU. Literally any M-class model, but the M5 is currently top tier.

The DGX Spark has basically the same memory bandwidth as an M5 Pro, and far more than an M5.

Only the M3 Ultra really beats it, and once you start scoping out the cost of an M3 Ultra with 128GB or 256GB, the DGX Spark doesn’t look bad after all.

> The DGX Spark has basically the same memory bandwidth as an M5 Pro, and far more than an M5.

I see ~274 GB/sec for the DGX Spark[1], versus 307 GB/sec for M5 Pro and 460 or 614 GB/sec for M5 Max[2]. One might call 90% "basically the same", but there are nominally two tiers above "Pro".

Yes, a MacBook Pro with 128 GB and M5 Max costs $5100 (14") or $5400 (16") versus currently $4700 for the DGX Spark, but the MBP includes keyboard, trackpad, battery and portability. I believe its prefill is slower and you get 2 TB vs 4 TB SSD, but overall one gives up a lot to save 10% of the cost.

[1] https://docs.nvidia.com/dgx/dgx-spark/hardware.html
[2] https://support.apple.com/en-us/126319
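
For anyone who wants to check the arithmetic, here is a quick sketch of the ratios. It only uses the bandwidth and price figures quoted in this comment (I have not verified them beyond the links), and the dictionary keys are just labels I made up for illustration.

```python
# Figures as quoted in this thread (GB/s and USD); treat them as assumptions.
BANDWIDTH_GB_S = {
    "dgx_spark": 274,
    "m5_pro": 307,
    "m5_max_low": 460,
    "m5_max_high": 614,
}
PRICE_USD = {
    "dgx_spark": 4700,
    "mbp14_m5_max_128gb": 5100,
    "mbp16_m5_max_128gb": 5400,
}

# How close is the Spark to the M5 Pro, and how far ahead is the top M5 Max?
spark_vs_pro = BANDWIDTH_GB_S["dgx_spark"] / BANDWIDTH_GB_S["m5_pro"]
max_vs_spark = BANDWIDTH_GB_S["m5_max_high"] / BANDWIDTH_GB_S["dgx_spark"]

# Fraction of the MacBook Pro price saved by buying the Spark instead.
saving_14 = 1 - PRICE_USD["dgx_spark"] / PRICE_USD["mbp14_m5_max_128gb"]
saving_16 = 1 - PRICE_USD["dgx_spark"] / PRICE_USD["mbp16_m5_max_128gb"]

print(f"Spark / M5 Pro bandwidth: {spark_vs_pro:.0%}")
print(f"M5 Max (top) / Spark bandwidth: {max_vs_spark:.2f}x")
print(f"Saving vs 14-inch MBP: {saving_14:.0%}, vs 16-inch MBP: {saving_16:.0%}")
```

So the "90%" and "10%" figures are rounded: the quoted numbers work out to roughly 89% of the M5 Pro's bandwidth, and a saving of about 8% or 13% depending on the MacBook size.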

I looked, but a sibling comment just provided the links. ~274 GB/sec for the DGX Spark, vs. 307 GB/sec for M5 Pro, and max 614 GB/sec (!!!) for M5 Max? Why would you completely friggin’ lie about this, or at minimum, not double-check your facts before bullshitting? Plus, you get a full-fledged computer along with it!

Apple could actually be a good deal and you folks would still make up something to not justify it. In a way, it’s amazing what Apple has accomplished: a baseless, negatively tainted perception in certain influential tech circles.

(To be fair, they’re kind of earning it. I’m glad Tim “Sweet T” Cook is departing.)

Plus, my original comment got downvoted despite being factually correct. Thanks, Reddit. Oh, wait…

Yep. Memory bandwidth is (mostly) what decides how fast LLMs generate tokens. The DGX Spark has something like 270 GB/s of memory bandwidth, and the M5 Max is ~615 GB/s. Theoretically DOUBLE the speed. In practice he only generates like 25% more tok/s, but that's still very impressive.

The Spark can fine-tune models in 1/4 the time and excels at other compute tasks in ways that Mac never can. Plus the high-bandwidth ConnectX-7 ports would be like $1700 to buy on a card just for the network adapters... But for generating tokens, it just plain loses.
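
A rough back-of-the-envelope sketch of why bandwidth sets the ceiling: at batch size 1, a dense model has to stream roughly all of its weights from memory for every generated token, so tokens/s can't exceed bandwidth divided by model size. The 35 GB model size below is a hypothetical (about a 70B-parameter model at 4 bits per weight), and the function name is mine. This ignores KV-cache reads, compute limits, and MoE models, so real numbers will come in lower.

```python
def decode_upper_bound_tok_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed for a dense model at batch size 1.

    Assumes every generated token requires reading all weights once,
    so throughput is capped at (bytes/s available) / (bytes per token).
    """
    return bandwidth_gb_s / model_size_gb


# Hypothetical model: ~70B parameters at 4 bits per weight is about 35 GB.
MODEL_SIZE_GB = 35.0

# Bandwidth figures quoted above (approximate).
spark_tok_s = decode_upper_bound_tok_s(270, MODEL_SIZE_GB)
mac_tok_s = decode_upper_bound_tok_s(615, MODEL_SIZE_GB)

print(f"Spark ceiling: {spark_tok_s:.1f} tok/s")
print(f"Mac ceiling:   {mac_tok_s:.1f} tok/s")
print(f"Theoretical ratio: {mac_tok_s / spark_tok_s:.2f}x")
```

The ratio is the same whatever model size you plug in, since it cancels out. The gap between that ceiling and the ~25% seen in practice is the part this simple model doesn't explain.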

How noisy does his fan get…

it doesn’t get noisy at all