this is indicative to me that the exponential is slowing down. tool and model progress was huge in 2025 but has been pretty stale this year. the usage changes from anthropic, gemini, and openai indicate it's just an economies-of-scale issue now, so unless there's a major breakthrough they're going to settle down as vendors of their own, broadly similar, flavors of apis.
I think it signals that they’ve been so successful that they need to ensure there is some direct financial back pressure on heavy users to ensure that their heavy token use is actually economically productive. That’s not a bad thing. Giving away stuff for free - or even apparently for free - encourages a poor distribution of value.
> I think it signals that they’ve been so successful that they need to ensure there is some direct financial back pressure on heavy users to ensure that their heavy token use is actually economically productive.
Jesus, the spin on this message is making me dizzy.
They finally try to stop running at a loss, and you see that as "they've been so successful"?
Here's how I see it: they all ran out of money trying to build a moat, and now realise that they are commodity sellers. What sort of profit do you think they need to make per token at current usage (which is served at below cost)?
How are they going to get there when less-highly-capitalised providers are already getting popular?
I built a web-scale infrastructure service that supports tens of millions of end users over a 15-year timeline. One of the most successful moves we made was to charge customers appropriately for their usage and to adjust how we calculate usage from time to time in order to tweak that feedback signal. It's amazing how customers learn to adapt in response to even very modest financial signals - in the aggregate.
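To make that concrete, here's a minimal sketch of the kind of tunable usage formula I mean. All names, weights, and prices are made up for illustration; this is not anyone's actual billing code:

```python
# Hypothetical metered-billing sketch: bill on a weighted usage score
# rather than raw counts, so the weights can be retuned over time to
# steer customer behavior without touching the headline price.

from dataclasses import dataclass

@dataclass
class Usage:
    requests: int
    bytes_out: int
    cpu_ms: int

# Tweaking these weights is the "feedback signal": raising the cpu_ms
# weight nudges customers toward cheaper request patterns, in aggregate.
WEIGHTS = {"requests": 1.0, "bytes_out": 0.000001, "cpu_ms": 0.002}
PRICE_PER_UNIT = 0.0001  # dollars per billable unit (hypothetical)

def billable_units(u: Usage) -> float:
    return (WEIGHTS["requests"] * u.requests
            + WEIGHTS["bytes_out"] * u.bytes_out
            + WEIGHTS["cpu_ms"] * u.cpu_ms)

def monthly_charge(u: Usage) -> float:
    return billable_units(u) * PRICE_PER_UNIT

# Example: a moderately heavy customer's month.
print(monthly_charge(Usage(requests=2_000_000,
                           bytes_out=5_000_000_000,
                           cpu_ms=80_000_000)))  # -> 216.50
```

The point isn't the specific numbers; it's that a formula like this gives you a knob per cost driver, and customers respond to even small turns of those knobs.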
I don't think there has been any exponential in terms of inference costs in the last couple of years. If anything, they have worsened: the same relevant hardware has gotten more expensive, so has energy, and to top it off, staying SotA means running larger models with higher inference costs. But for some reason people are conflating improvements in the models with the cost of inference.
> this is indicative to me that the exponential is slowing down
I've also heard that we're near the end of the exponential.
What makes you think that progress has stopped? Anecdotally, it seems to me that it has accelerated: I am having conversations with ambitious non-tech people who are now excited and staying up late learning about the CLI and GitHub. They seem to have moved beyond Lovable and are actually trying to embed some agents in their small businesses, etc.
> They seem to have moved beyond Lovable and are actually trying to embed some agents in their small businesses, etc.
That's the problem: these small businesses are writing code, models from last year are good enough for them, and as a small business they can easily shell out for hardware to self-host.
The minute businesses take up AI for their business processes, the will to buy each employee a subscription is going to go the way of the dodo.
Honestly? It was the Claude Code leak that did it for me. There was a lot more smoke and mirrors than I anticipated: the tool-call poisoning, what the prompting looks like, how "messy" a lot of it was, etc.
I meant that I think the exponential with the models themselves is slowing down (AGI, etc.). The applications for regular people, though, will continue to move forward.