Has anyone done any benchmarks on the NVFP4 quant? Seriously considering pitching an 8 x RTX 6000 Pro box at work to run GLM-5.2 in an air-gapped environment.

At that price point you could also go with a Tenstorrent Galaxy Blackhole, which starts at $110,000.

Ooh, I hadn't seen these yet! That looks quite compelling; my only hesitation would be what the software support looks like. But 1 TB of memory for $110k is really intriguing - I might go bother a sales rep. Thanks!

Good luck. I’m in the legal field, and even there, selling air-gapped is tough.

What challenges have you seen in selling air-gapped? Is it the high upfront cost, hardware maintenance, or something else?

We already use AWS. Everyone else is using AWS, so if there's an issue we can just say we were following industry standards.

My issue is we likely can't use AWS (non-US, with CLOUD Act and export control concerns).