I think they could release non-agentic models that are as good as 4o, and have almost no repercussions on sales tbh.
I have Ollama installed (only a small proportion of their clients would have a large enough GPU for this) and have downloaded DeepSeek and played with it, but I still pay for an OpenAI subscription because I want the speed of a hosted model, never mind the luxuries of things like Codex's diffs/pull request support, agents on new models, deep research etc. - I use them all at least weekly.
I pay for Cursor, OpenAI and Kimi (to use with Claude Code). OpenAI is good at quickly refining my thoughts. I’m considering cancelling my Cursor subscription: I bought it for Claude, but the rate limits are making it impossible for me to find it useful. Kimi is what truly surprises me: Claude Code shows "this conversation cost you $500" (based on Opus usage, which is mapped to Kimi K2) while I’ve barely spent $2. I have Ollama as well, mainly to quickly test small models that could be improved for our use case through fine-tuning.
I've been using Kimi with Roo via OpenRouter and have been very surprised at how capable it is. It's the first open model I've tried that actually lives up to the claims I see online that it's on par with this or that previous-gen proprietary model. Context window has been the only negative, at least with the providers OpenRouter has been giving me, but forgivable given how absurdly cheap it is.
> but forgivable given how absurdly cheap it is
Are you using it every day for programming? If so, roughly how much does it cost you per month? More or less than $100?
What am I doing wrong that I'm never hitting the rate limits on the $100 Max plan?
Considering that my own heavy use also doesn't lead to rate limits, and what I've seen from some users over the past months, I suspect it's a mix of actually thinking about your code before writing a prompt, managing context by documenting, and running stuff like git, npm install, etc. yourself instead of "Hey Claude, set up React with Radix and install a few packages". I have genuinely seen someone use ultrathink for setting up a starter repo hosted on GitHub, despite the commands being listed in the readme, so I can see how certain people may hit the limits quicker than others. Still, I will cancel my Claude Max subscription if they remain opaque about the amount of use we actually get, especially given the email they sent out recently, which stated that 20x Max users only get 10x in terms of expected usable hours. Same goes for still not providing an official way to track how much use one has left in a week.
> running stuff like git, npm install, etc. yourself
Ah, this definitely makes sense! I do this myself and then paste back only the relevant part of the log to limit how much context gets used. I suspect I am being more conservative than others.
I am on the Pro plan. I was considering Max, but then I found Kimi and I’m getting used to it.
Are you using Kimi with Claude Code? Are you using it via OpenRouter?
With Claude Code, and Kilo as well. I’m using Moonshot’s API.
Thank you
Are you using a proxy to connect Claude Code to Kimi?
And how much do you estimate it would cost in a month of daily usage?
They would definitely have sales repercussions, but it might be worth it.
They are fully trying to be a consumer product, developer services be damned. But they can’t just get rid of the API because it’s a good incremental source of revenue, and thanks to the Microsoft deal, all that revenue would otherwise end up in Azure. Maintaining their API is basically just a way to get a slice of that revenue.
But if they open sourced everything, it might further sour the relationship with Microsoft, who would lose Azure revenue and might be willing to part ways. It would also ensure that they compete on consumer product quality, not (directly) on model quality. At this point, they could basically put any decent model in their app and maintain the user base; they don’t actually need to develop their own.