We're planning to do the same thing - buy something like 8xH100 and run all coding there. The CTO almost agreed to find the budget for it but I need to make sure there are no risks before we buy (i.e. it's a viable/usable setup for professional AI-assisted coding)

Can you share what models you run and find best performing for this setup? That would help a lot. I already run a smaller AI server in the office but only 32b models fit there. I already have experience optimizing inference, I'm just interested what models you think are great for 8xH100 for coding, I'll figure out the details how to fit it :)

8 x h100 80's don't give you enough to run the latest 1tn + parameter models (especially at the context window lengths to be competitive with the frontier models)

Verda has B300 clusters, 8 for USD $55/hour in 10 minute billing blocks

Check out Verda you can rent whatever super powerful GPU clusters you need in 10 minute increments. Deploy any open weight model using SGLang and away you go

Deepseek, GLM, Minimax or Kimi are the most likely contenders.

I’ve been using kimi 2.5/2.6 for the past 2 weeks and it’s really not far off OpenAI and Claude models. I am a coder so it’s not all vibes but I am definitely more in the “spec to code” mode than “edit this file for me” and it copes just fine. Needs a bit more supervision than the frontier models but it’s also significantly cheaper. If I were anthropic I’d be shitting myself, their prices are going to 10x over the next 2 years

So are you running Kimi on Verda?