at their scale they could also just run a large on-premise or rented (basically still cloud, but cheaper) GPU cluster and run through that. fixed costs, even license a SOTA model’s weights if you’d like
The problem isn't really Uber, Microsoft or Nvidia, it's all the smaller non-IT companies that also have developers on staff. They are screwed. $1500 per seat per month is just way too expensive, but they also can't afford to build and maintain their own on-premise solution. If Microsoft can't afford to run Copilot for their own developers, what chance do any of their customers stand?
If the large, well-funded IT companies of the world believe the current AI cost is too high, then Anthropic, OpenAI and Copilot have no actual customer base. AI is then relegated to a very profitable niche business, but that can't fund the R&D for the models.
It's an extra 18k a year for developer tools when they're paying how much a year per developer? Having software developers at all isn't cheap.
Also, I don't believe you need to spend $1500 a month on a coding agent if you optimize usage at all.
In Latvia, the net salary for a Java dev is around 1729 - 4314 EUR per month, based on https://www.algas.lv/algu-informacija/informacijas-tehnologi... (crowdsourced data)
For the employer those employees cost between 2945 - 7736 EUR per month based on https://kalkulatori.lv/lv/algas-kalkulators (income and social taxes).
So on the lower end that's (1500 USD ~ 1300 EUR) close to half the total expenses of such a developer, on the high end here around 15-20%. That's quite significant, and whether it's worth it depends on whether their productivity also improves (if that's what the orgs care about).
And we’re not even the country with the worst pay out there, but pay the same for tokens, cause regional pricing isn’t a thing!
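For anyone who wants to check the percentages, here's a rough sketch of the arithmetic using only the figures quoted above. The 1300 EUR figure is the approximate conversion of 1500 USD from the comment, not a live exchange rate, and the function name is just something made up for illustration.

```python
# Rough check of the tool-cost share quoted above.
# All figures come from the comment; 1300 EUR is its approximate
# conversion of 1500 USD, not a live exchange rate.

def tool_cost_share(tool_eur: float, employer_cost_eur: float) -> float:
    """Monthly tool spend as a fraction of the employer's monthly cost."""
    return tool_eur / employer_cost_eur

TOOL_EUR = 1300            # ~1500 USD per seat per month
EMPLOYER_COST_LOW = 2945   # EUR per month, lower end
EMPLOYER_COST_HIGH = 7736  # EUR per month, higher end

low_end_share = tool_cost_share(TOOL_EUR, EMPLOYER_COST_LOW)
high_end_share = tool_cost_share(TOOL_EUR, EMPLOYER_COST_HIGH)

print(f"lower-paid dev:  {low_end_share:.0%} of employer cost")
print(f"higher-paid dev: {high_end_share:.0%} of employer cost")
```

That comes out to roughly 44% at the low end and 17% at the high end, which is consistent with "close to half" and "around 15-20%".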
I wonder how this plays out. Perhaps programmers in these countries will use cheaper models like Deepseek and they will be able to compete better, so offshoring continues?
> Perhaps programmers in these countries will use cheaper models like Deepseek and they will be able to compete better, so offshoring continues?
Even here, companies don't really trust Eastern providers that much, so they'd be looking for someone in the EU running DeepSeek instances, which might come with a bit of markup. Those orgs would also sometimes be wary of OpenRouter, which to me seems like shooting yourself in the foot by being so picky.
That said, DeepSeek V4 Pro (with Max reasoning) is pretty okay and I'm using it instead of Opus 4.8 (my Max 100 USD subscription weekly limits ran out today). It can do stuff passably (even better than Mistral's offering, and it has a nice context window), but compared to the amount of work I can get done with Anthropic's models, it keeps occasionally fucking up and I have to go back and correct it, so lots of token waste. Maybe it's close to SOTA from 6-12 months ago, which is pretty cool on its own - there's just less confidence in its output.
It's like trying to limit the costs and therefore not gaining the maximum added value from the technology. Similarly for those trying to run stuff on-prem, we don't really have the electrical grid here for large scale inference in-country, nor is anyone exactly salivating at the idea of dropping multiple tens of thousands of EUR to build out something passable. I do host some stuff on a bunch of Nvidia L4 cards (Qwen3.6 35B A3B) and while the model has its uses, it's also a far cry from SOTA.
So I guess it depends - compared to an Anthropic subscription it kinda sucks, but then again if you have to pay for Anthropic's tokens those are robbery and then DeepSeek looks like a no-brainer alternative.
$18k a year is a non-starter in most companies. I've seen companies balk at IntelliJ.
That depends on where you are. $18K is the equivalent of paying around 15% more for your developer.
In HCOL locations yes, but in the south of Spain you can get full-time talent for that figure. It's also an entry-level salary in Eastern Europe, with Ukraine and Turkey even being somewhat cheaper.
There are models for every price point. What was SOTA and stupid expensive to run a year ago is a cheap flash model today.
Why are smaller non-IT companies "screwed" because they can't pay out the nose for their developers' AI usage? They're non-IT companies, developers are presumably not on their critical path, or not their bottleneck. Developers can keep on writing code the old way, or doing it with a more reasonable AI spend. I don't see how this "screws" any company.
That was badly worded on my part, my intent was to say that there is no way they can or will pay $1500 per month per seat.
> even license a SOTA model’s weights if you’d like
Yeah, I bet all labs releasing SOTA models are more than happy to remove the main way they make money and let you run it locally, especially if you're a big spender like Uber who seems very willing to throw money into the sea as an experiment.
That's going to stop eventually, and I think at that point we're going to see business models more like the major CAD providers.
I don't think they'll have a choice, open-weight models are not far behind. At some point it's essentially a commodity game.
they also already do this…
Anthropic and OpenAI license to the public clouds. Google reportedly licenses to Apple. licensing to Fortune 100 companies running on their own infra is an obvious next step
it is a race to the bottom and I’m not sure the labs win that race. we’ll see!
I'm not sure the labs will win either. I wouldn't be surprised to see OpenAI & Anthropic just get acquired, by either Microsoft or Amazon, and their models just become another product offering in their public cloud and some hybrid on-prem offering like Azure Stack HCI or Azure Stack Hub (already basically a "cloud in a black box" that could become "AI in a box").