>It’s fucking video made by a computer after you type a sentence
Geez. Breathe a little bit. It's always weird to see somebody who had zero involvement in the creation/engineering/design of a product so disproportionately defensive about it.
Text-to-video as a concept, going back to LTX, CogVideo, and AnimateDiff, is closing in on two years of development at this point, so there's naturally going to be a little less breathless enthusiasm.
If you had experience with even the locally hostable stuff like Hunyuan, Wan 2.2, VACE, etc., you'd probably be less impressed as well. The video they demoed had more fast cuts than a Michael Bay movie, illustrating the exact problem that video models STILL suffer from: a failure to generate anything longer than 60 seconds. In fact, I didn't see anything longer than 10 seconds. Maybe it's tailor-made for an ADHD audience who grew up on Vine videos.
On a more positive note, the physics have definitely improved, though you can tell in some of the shots that coherency degrades the longer a scene goes on (see the volleyball clip).
I'm assuming you've played more with AI video creation than I have.
Was there anything impressive here or is this mostly "meh"? I didn't see this solve any of the problems with AI videos but maybe this is solving something that I didn't know was a problem?
That's really what I'm trying to figure out with this announcement. Seeing hundreds of comments about how impressive this is, with none really discussing why, has me trying to figure out what part of the hype I'm missing.
Good question. The one thing Altman really seemed keen to play up was the whole "integrate yourself into the video" feature, which from what I watched is definitely a step beyond the more conventional image-to-video models.
Depressingly that's probably a killer feature since if there's one thing people want to see more of it's themselves.
IMHO, the fact that they're trying to position this as a sort of infinite doom-scrolling TikTok also lends support to the idea that their models are still only suitable for relatively short videos, since coherency probably falls off a cliff after 30-60 seconds.
I think a lot of it is the 'dog trained to play the piano at a 4th-grade level' effect: yes, it's really notable and impressive, but it gets old fast and isn't obviously useful beyond the novelty. You need to put in a lot more effort to get any actual value out of these tools, and at that point their strengths and limitations become a lot clearer and it's obvious that they are not an everything machine.
I agree. But I'd add that I'd be impressed and wowed if what you showed me was "Actually, now the dog is playing piano at a 5th grade level" or "Now the same dog can play the drums!".
Getting a "WTF are you talking about, it's a dog playing the piano, why aren't you impressed!" doesn't help the case. I've already seen a dog play the piano and this doesn't appear to be any better than the last time I saw it.
I'm not seeing comparisons to other video generation that already exists. That's what I want to see. Is this faster, crisper, better at continuity? Why is this impressive versus the competition?
> it's obvious that they are not an everything machine
Whoever said they were an everything machine?
When I first used the Internet in the 90s, I wasn’t like, “This sucks there’s no Amazon and no Facebook.”
I spend about 16 hours a week making AI videos, and I'm not at all impressed by Sora 2. OpenAI merely seems less embarrassingly behind than they were, but still clearly behind.
As someone who loves seeing videos that appear after I type something, is there anything I should be noticing here? Or is this just a bit of a special moment for people who feel, for some reason, that there is some personal value in being deeply bought into the OpenAI ecosystem, and who aren't aware of the AI video scene as a whole?
I agree. It's really interesting that computers can make videos from sentences.
This, however, isn't the first AI capable of doing that, correct? Didn't Sora 1, Veo, and others come out before this, making videos from sentences?
Surely what makes this impressive isn't "this does what other people have done before, including us, OpenAI".
Since you edited, I will respond to your inflammatory edit.
> Shocker lol
> Like a guy who hates tomatoes, bread, cheese, and pepperoni going, “This pizza sucks.”
This is a technology; don't reduce it to preferences. There are obvious flaws with video generation tech, and one of the most annoying parts of talking to AI enthusiasts is that they are unable to engage in honest dialog: to them, all AI is amazing alien tech, always flawless and perfect.
What I've seen with AI video in the past is that it can produce videos that look impressive at first glance, but when you dig into them, things are "off". Further, continuity only lasts for one continuous shot. That produces videos where every 5 seconds you see a new shot, and within the same setting things tend to change drastically.
Sora 2 appears to have all those problems; it doesn't appear to have solved any of them. That's why I ask, "What's super impressive about this?"
The same way I'd ask "What's super impressive about ChatGPT 5 vs 4?". Snarkily saying "What are you talking about, it is a fucking realistic chat that can write a short story!" doesn't convince or impress me or anyone else that's a skeptic.
I'm not enthralled by AI. I'll happily use it when it makes sense, and as it improves I'll probably use it more. For this, I don't see a significant improvement over the prior state of the art.
> This, however, isn't the first AI capable of doing that, correct? Didn't Sora 1, Veo, and others come out before this, making videos from sentences?
It seems like there was a 90% chance you weren’t impressed by these either…
> doesn't convince or impress me or anyone else that's a skeptic
Yes, this is why I asked what % of comments you’ve made have been skeptical about AI. There’s nothing that will impress you.
I didn’t want to convince you; I wanted to find out if I should have any interest in what you had to say.
> It seems like there was a 90% chance you weren’t impressed by these either…
And you'd be wrong in that assumption. When these first came out, particularly Veo, I was quite impressed with the photorealistic scenes it produced. Just like I was pretty impressed by the "this person does not exist" website when it first launched.
> Yes, this is why I asked what % of comments you’ve made have been skeptical about AI. There’s nothing that will impress you.
New and novel things impress me. What's new and novel about this?
> I didn’t want to convince you; I wanted to find out if I should have any interest in what you had to say.
Ah, so my opinion is only worth something if I say "I believe AI is the best thing that has ever been invented and I will kill myself to further it!!"
Go find a church if you can only talk with zealots.
> New and novel things impress me
They don’t seem to, if 90% of what you have to say about AI is, by your own admission, tepid at best.
I honestly don’t understand how anyone can look at the state of AI today (even if the gains are perceived to be marginal compared to the last version) and be like, “meh, not impressive, better say something negative about it.”
But that seems to be the gospel here in the Church of HN.
The personas seem to be something new and pretty impressive.
You obviously never actually used Sora.
These are hand-selected from thousands of prompts. People who don't use these tools think that you prompt the video like you're Scorsese directing a film. That is exactly what AI video is not.
More like: you write a prompt and then try, try, try again to get something that vaguely does what you wanted. Most of the time it doesn't even get close. The best video is usually stuff you didn't even intend. It is great for demo reels, but that is about it.
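To sketch what that loop actually looks like (everything here is hypothetical for illustration; `generate_video` and its "score" are made-up stand-ins, not any real text-to-video API):

```python
import random

def generate_video(prompt: str, seed: int) -> float:
    # Hypothetical stand-in for a real text-to-video API call.
    # A real model returns a clip; here we just return a fake
    # "how close was this to what I actually wanted" score.
    rng = random.Random(f"{prompt}:{seed}")
    return rng.random()

def reroll(prompt: str, attempts: int = 8):
    """Generate the same prompt `attempts` times and keep the best take.
    In practice the 'scoring' step is you, squinting at eight clips."""
    best_seed = max(range(attempts), key=lambda s: generate_video(prompt, s))
    return best_seed, generate_video(prompt, best_seed)
```

The point is that the unit of work isn't "one prompt, one video"; it's "one prompt, N takes, pick the least wrong one."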
They already made such a big deal about Sora 1 for months before it came out, and what have you seen from it since it actually released? Nothing. Absolutely nothing.
Honestly, Midjourney blows away what I've seen in this video as far as being pretty, but Midjourney video has the same problem. Your imagination is filling in all these AI video features that don't exist.
I have used Sora many times! If you need to see a Scorsese movie as output in order to be impressed, then you will be incredibly disappointed.
> It is great for demo reels, but that is about it.
This is what is amazing! Imagine you go back to 2020 and say, “Hey, I’ve got this website where you can type a sentence and it’ll return a little computer-generated video that looks pretty real.”
The only thing lamer than people talking about how unimpressive AI is is the people talking about AI’s environmental impact.
Even someone who doesn't like pizza will recognize that pizza is food.