Relative to the hype they've been spinning to attract investment, casting the launch and commercialization of ChatGPT as their greatest achievement is quite a significant downgrade, especially given that they only got there first because they were the first entity reckless enough to deploy such a tool to the public.
It's easy to forget what smart, connected people were saying ~a year ago about how AI would evolve by <current date>, when in fact what we've gotten since then is a whole bunch of diminishing returns and increasingly sketchy benchmark shenanigans. I have no idea when a real AGI breakthrough will happen, but if you're a person who wants it to happen (I am not), you have to admit to yourself that the last year or so has been disappointing, even if you won't admit it to anybody else.
ChatGPT was released two and a half years ago though. Pretty sure that at some point Sam Altman had promised us AGI by now.
The person you're responding to is correct that OpenAI feels a lot more stagnant than other players (like Google, which was nowhere to be seen even a year and a half ago and now has the leading model on pretty much every metric, but also DeepSeek, which built a competitive model in a year that runs much cheaper).
> Google has the leading model on pretty much every metric
Correction: Google had the leading model for three weeks. Today it's back to second place.
press X to doubt
o3-mini wasn't even in second place for non-STEM tasks, and in today's announcement they don't even publish benchmarks for those. What's impressive about Gemini 2.5 Pro (and was also really impressive with R1) is how good the model is across a very broad range of tasks, not just benchmaxing on AIME.
I had a philosophical discussion with the o3 model earlier today. It was much better than Gemini 2.5 Pro. In fact, it was pretty much what I would expect from a professional philosopher.
I'm not expecting someone paying $200 a month to access something to be objective about that particular something.
Also, “what I would expect from a professional philosopher”: is that your argument, really?
I’m paying $20/mo, and I’m paying the same for Gemini and for Claude.
What’s wrong with my argument? You questioned the performance of the model on non-STEM tasks, and I gave you my impression.
Writing philosophy that looks convincing has been something LLMs do well since the first release of ChatGPT back in 2022 (in my country, back in early 2023, TV featured a kind of competition between ChatGPT and a philosopher turned media personality, with university professors blindly reviewing both essays and trying to determine which was which).
To get an idea of how good a model is on non-STEM tasks, you need to challenge it on things that are harder than this for LLMs, like summarization without hallucination or creative writing. OpenAI's non-thinking models are usually very good at these, but their thinking models are not, whereas other players (be it Google, Anthropic or DeepSeek) manage to make models that can be very good at both.
I've been discussing a philosophical topic (brain uploading) with all the major models over the last two years. It's a topic I've read and thought about for a long time. Until o3, the responses I got from all other models (Gemini 2.5 Pro most recently) were underwhelming: generic, high-level, not interesting to an expert. They struggled to understand the points I was making and the ideas I wanted to explore. o3 was the first model that could keep up and provide interesting insights. It was communicating at the level of a professional in the field, though not an expert on this particular topic, and that is a significant improvement over all existing models.
I guess it was about the most recent period rather than the full picture.
What are people expecting here honestly? This thread is ridiculous.
They have 500M weekly users now. I would say that counts as doing something.
While bleeding cash faster than anything else in history.
ChatGPT was released in 2022, so OP's point stands perfectly well.
They're rumored to be working on a social network to rival X, with a focus on image generation.
https://techcrunch.com/2025/04/15/openai-is-reportedly-devel...
The play now seems to be less AGI and more "too big to fail": use all that capital to morph into a FAANG-style big tech company.
My bet is that they'll develop a suite of office tools that leverage their model, chat/communication tools, a browser, and perhaps a device.
They're going to try to turn into Google (with maybe a bit of Apple and Meta) before Google turns into them.
Near-term, I don't see late-stage investors recouping their investment. But in time, this may work out well for them. There's a tremendous amount of inefficiency and lack of competition among the big tech players. They've been so large that nobody else could effectively challenge them. Now there's a "startup" with enough capital to start eating into big tech's more profitable business lines.
I don't know how anyone could look at any of this and ponderously conclude: "it's basically the same as Nov 2022 ChatGPT, so strategically they're pivoting to social to become too big to fail."
I mean, it's not fucking AGI/ASI. No amount of LLM flip-floppery is going to get us Terminators.
If this starts looking different and the pace picks up, I won't be giving analysis on OpenAI anymore. I'll start packing for the hills.
But to OpenAI's credit, I also don't see how minting another FAANG isn't an incredible achievement. Like - wow - this tech giant was willed into existence. Can't we marvel at that a little bit without worrying about LLMs doing our taxes?
I don't know what AGI/ASI means to you.
I'm bullish on the models, and my first quiet 5 minutes after the announcement were spent thinking about how many of the people I walked past would have a different day if the computer Just Did It(tm). (I don't think their day would be different, so I'm not bullish on ASI-even-if-achieved, I guess?)
I think binary analysis that flips between "this is a propped-up failure, like when banks get bailouts" and "I'd run away from civilization" isn't really worth much.
So to you AGI == terminators? Interesting.
I appreciate the info and I have a question:
Why would anyone use a social network run by Sam Altman? No offense, but his reputation is chaotic neutral, to say the least.
Social networks require a ton of momentum to get going.
BlueSky already ate all the momentum that X lost.
Social networks have to be the most chaotic neutral thing ever made. It's like, "hey everyone! Come share whatever you want on my servers!"
Most people don't care about techies or tech drama. They just use the platforms their friends do.
ChatGPT images are the biggest thing on social media right now. My wife is turning photos of our dogs into people. There's a new GPT-4o meme trending on TikTok every day. Using GPT-4o as the basis of a social network could be just the kickstart a new platform needs.
Not surprising. Add comments to sora.com and you've got a social network.
Seriously. The users on sora.com are already trying to. They're sending messages to each other via text embedded in the images and upvoting them.
GPT-4o and Sora are incredibly viral and organic, and they're taking over TikTok, Instagram, and all other social media.
If you're not watching casual social media you might miss it, but it's nothing short of a phenomenon.
ChatGPT is now the most downloaded app this month. Images are the reason for that.
Honestly I popped on sora.com the other day and the memes are great. I can totally understand where folks are coming from and why this is happening.
ChatGPT should be built into my iMessage threads with friends. @chatGPT "Is there an evening train on Thursdays from Brussels to Berlin?" That's something a friend and I were discussing, but we had to exit iMessage, use GPT, and then go back to iMessage.
For UX, the GPT reply in the thread would be collapsed by default, and both users would have the discretion to click to expand it.
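To make that concrete, here's a rough sketch of how such an @chatGPT mention handler could work. The handle_group_message hook and the idea of a message bridge are hypothetical (iMessage has no public bot API); only the OpenAI chat completions call is a real API:

    # Hypothetical sketch: a group-chat bot that answers @chatGPT mentions.
    # handle_group_message is an assumed callback from some message bridge,
    # included purely for illustration.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    TRIGGER = "@chatGPT"

    def handle_group_message(text: str) -> str | None:
        """Return a reply (collapsed by default in the UI) when the bot is mentioned."""
        if TRIGGER not in text:
            return None  # not addressed to the bot, stay silent
        question = text.split(TRIGGER, 1)[1].strip()
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "Answer briefly for a group chat."},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    # e.g. handle_group_message('@chatGPT Is there an evening train on Thursdays from Brussels to Berlin?')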
Seriously. The level of arrogance combined with ignorance is awe-inspiring.
True. They've blown their absolutely massive lead with power users to Anthropic and Google. So they definitely haven't done nothing.