I can't help but see these technologies and think of Jeff Goldblum in Jurassic Park.
My boss sends me complete AI Workslop made with these tools and goes, "Look how wild this is! This is the future," or sends me a YouTube video with fewer than a thousand views of a guy who created UGC with Telegram and point-and-click tools.
I don't think he ever takes a beat, looks at the end product, and asks himself, "Who is this for? Who even wants this?" And that's aside from the fact that I still think this content has so many obvious tells that you know right away it is AI.
It's a fairly useful tool if you know how to use it. People will also play with it as a toy. It's much like the masses getting access to cheap video cameras and smartphones with good cameras. It's going to enable different content; it's not going to make more Hollywood movies. This is an early example of what people will make: https://www.youtube.com/watch?v=jBwluRXtS2U . It's just one person making all of this on the side.
This was my reaction when I saw Meta’s “Vibes” app. Who wants to browse a stream of exclusively AI generated videos? Obviously Meta wants that because it’s a lot cheaper than actually paying real people to make content… but it’s slop.
Oh my god.. "Meta".. "Vibes"..
Facebook has become the cringe how-do-you-do-fellow-kids uncle that Microsoft has been since the 1990s.
This is not the final target. It's video generation now, but that's just a stepping stone. The real thing is that learning a generator is also learning a prior over videos, and hence over how the world works. The real application of this will be world models, vision-language-action models, spatial AI and robotics. Basically a kind of learned simulator in which to plan and imagine possible futures, possible actions, affordances, etc. Video models could become a spatial reasoning platform too. A recent paper by DeepMind (using Veo 3) showed that video models can perform many high-level vision tasks out of the box.
Don't think it's going to end here at some slop feed.
> This is not the final target
The final target of these "world models" on a 20-year horizon is entirely unmanned factories taking over the economy, and swarms of drones and robots fighting wars and policing citizens.
This is why hundreds of billions are poured into these things; cute Ghibli-style videos and vacuum robots wouldn't be worth this much money otherwise.
What’s so romantic about working in factories? Automation and robotics will accelerate the economy the same way information technology did, and humans will work on better problems than performing repeated tasks on an assembly line or flipping burgers.
There are arguably more jobs today as a result of computers than there were before they were invented. So why is the assumption that AI will magically delete all jobs while discounting the fact that it will create careers we haven’t even thought of?
> humans will work on better problems than performing repeated tasks on an assembly line or flipping burgers.
Haha. The current wave of “careers we couldn’t think of” that tech companies have created includes being Uber/Doordash/Amazon delivery drivers, data labelers for training AIs, moderators preventing horrific content from spreading on social networks, … with way weaker social benefits & protections than the blue-collar jobs of old they replaced.
So yeah, I have a hard time buying this fantasy of everyone doing some magical fulfilling work while AI does all the ugly work, especially when every executive out there is plainly stating that their ideal outcome is replacing 90% of their workforce with AI.
With the way things are headed, AI will take over large economic niches, and humans will fill in at the edges doing the grimy things AI can’t do, with ever diminishing social mobility and safety nets while AI company executives become trillionaires.
I actually see robot food delivery services around me, so it might not even be long before those Doordash jobs get replaced by automation. Now I see neighbors starting to get drone deliveries from time to time. Starship used to deliver to the datacenter I used before (it was technically on a college campus but unaffiliated), and I had a coupon for free ice cream delivered through Wing the other day.
https://www.starship.xyz/
https://wing.com/
> So why is the assumption that AI will magically delete all jobs while discounting the fact that it will create careers we haven’t even thought of?
I think that in a vacuum you could reasonably believe that this might be the case, but I feel like it isn't just about the technology these days; it's about the hunger C-suites and tech companies have for replacing their workforce with AI and/or automation. It's quite clear that layoffs and mass adoption of AI/automation raise shareholder value, so there is no incentive to create new jobs.
Will there be an organic shift away from Tech/IT/Computers into new fields? There might, but I think it's a bit naive to think that this will be proportionate to the careers AI will make redundant when there is such a big focus on eliminating as many jobs as possible in favor of AI.
The hope is that we have no employment and we move into a different form of society where AI takes care of us and allows us to focus on more spiritual, meaningful things.
For now AI is deleting many of the jobs the computer created.
The reality is we will more likely end up in a society where wealth/power at the very top will grow and the masses will be controlled by AI.
more than controlled, enslaved - 24/7 location monitoring (but also no need to ever go anywhere, as everything will be delivered), "perfect" nutrition (fed via IV or tasteless shakes), only "intelligent" conversation taking place between you and an AI agent (even if initially resistant, AI will successfully convince you to drop ties to relatives, friends, that is if ever allowed to make friends), all news delivered via AI-curated channels but is meaningless anyway since AI can create fake video of any leader or important person committing crimes, lying, etc, also all evidence of YOU committing a crime, or just embarrassing stuff like having a sex drive will be recorded and used as blackmail. A "job" to keep you occupied much of the day but your output is never actually needed and discarded by your AI agent "boss".
How is this not entirely obvious to everyone that this is the future? Could be 20, 50, 100 years, but coming for sure.
There are no world models in there, it's trained on arbitrary images/sequences. There are no world models in us, we learn from only specifics in topological space, stitched together in sharp wave ripples. Everything is from detached memories working through optic flow. That's not a world model, it's not even a model. It's an analog. This whole world model thing is another branding phase after language models failed to deliver. After world models it will be neuro symbolic, then RL will sweep in like a final boss fight, and then... it still won't work. Notice anything about these names? They're walking pneumonia paradoxes.
The point is that video generation is not the goal in itself. Just like classifying photos as cat vs dog wasn't the goal in 2013. I know that Sora 2 is not a world model.
But what's coming is: Vision-language-action models and planning, spatial AI (SLAM with semantics and 3D reconstruction with interactability and affordance detection). Video diffusion models, photo-to-gaussian-splats, video-to-3D (e.g. from Hunyuan), the whole DUSt3R/VGGT line of work, V-JEPA 2, etc. Or if you want product names, Gemini Robotics 1.5, Genie 3, etc. The field is progressing incredibly fast. Humanoid robots are progressing fast. Robotic hands with haptic sensors are more dexterous than ever. It's starting to work. We are only seeing the first glimpses, of course.
It's largely irrelevant in terms of intelligence. What you're describing is throwing out 2-D topological integrations (what we do to achieve optic flow ultra fast reaction times in motion), vicarious trial and error, and brute force imposing a machine wax fruit of motion dexterity. It's simply not analog to events the way we experience, it's been cooked up in cog-sci as imitation, but it's not even that. The more we understand the brain's architecture and process, the less relevant this gets, as it's not for legitimate long-term bio ware. There are no world models, the idea is oxymoronic as the topological bypasses this in scale invariance. It's all a dead end this binary, since eventually, analog will rule this with minimal energy and software and use an entirely different software. Think of any arriving too early industry, AI is irrelevant, the first step was reinventing software. It took the least efficient compute principle and drove it to irrelevance using machine vision as an endgame. The lack of redundancies is the tell.
I wonder what this fascination with human-shaped robots is, if spider-shaped robots could be more dexterous and productive.
(Unless it's sci-fi and porn that is mainly pushing for human-shaped robots.)
The built environment fits the human form factor well. Imitation learning and intuitive teleoperation are also easier. But it won't be the only form factor. The quadruped form (like Spot) is also popular, as well as drones, etc.
Sure. But why do I, as a user, want to download Vibes today?
I think generally I agree with you that this is a stepping stone towards bigger/potentially more important things... but that doesn't change the fact that they've packaged it for consumers as something that seems like it has, at best, close to zero utility and, at worst, incredible downsides. I'm not sure why releasing this to consumers helps achieve those goals.
Ad money to recoup the huge investments in datacenters that will do the training of the better models that do the things I mentioned. Meta is working hard on AR, glasses (Project Aria), egocentric modeling and spatial AI. At some point they may pull out the Metaverse idea again too; they are still working on avatars, it's just not so popularly hyped currently.
Jeff Goldblum in Jurassic Park?!?
Try Jeff Goldblum in The Fly! I just re-watched it, and the computer he uses is scarily close to our experiences now with AI. In fact, the entire "accident" (I won't spoil it) is a result of the "AI" deciding what to do and getting it wildly wrong.
If you want to see how these tools can be used by skilled people to produce quality content, watch the YouTube channel NeuralViz:
https://www.youtube.com/@NeuralViz