If you succesfully build a highly capable “aligned” model (according to some class of definitions that Anthropic would use for the words “capable” and “aligned”) and it brings about a global dark age of poverty and inequality by completely eliminating the value of labor vs capital, can you still call it aligned?
If the answer is “yes”, our definition of alignment kind of sucks.
> If the answer is “yes”, our definition of alignment kind of sucks.
Sure, but the original sense of this is rather more fundamental than "does this timeline suck?"
Right now, it is still an open question "do we know how to reliably scale up AI to be generally more competent than we are at everything without literally killing everyone due to (1) some small bug when we created the the loss function* it was trained on (outer alignment), or (2) if that loss function was, despite being correct in itself, approximated badly by the AI due to the training process (inner alignment)?"
* https://en.wikipedia.org/wiki/Loss_function
>and it brings about a global dark age of poverty and inequality by completely eliminating the value of labor vs capital
So, like the past 20 years?
Jobs are an invention of humanity. About 50% of people dislike their job. People spend much of their lives working. Poverty and inequality are a choice made by society if society chooses poorly.
They're only an invention if you consider "seeking sustenance to live" not explicitly a job if there's no monthly direct deposit involved.
Indeed.
On the plus side, if there really is no value to labour, then farm work must have been fully automated along with all the other roles.
On the down side, rich elites have historically had a very hard time truly empathising with normal people and understanding their needs even when they care to attempt it, so it is very possible that a lot of people will starve in such a scenario despite the potential abundance of food.
It's either: 1) the rich voluntarily share the means of production so everyone becomes equal, 2) the poor stage successful revolutions so they gain access to the means of production and everyone becomes equal, 3) the poor starve or are otherwise eliminated, and the survivors will be equal.
All roads lead to equality when the value of labour becomes 0 due to 100% automation.
There's plenty of outcomes besides those three.
Over history, lots of underclasses have been stuck that way for multiple generations, even without the assistance of a robot workforce that can replace them economically.
Some future rich class so empowered would be quite capable of treating the poor like most today treat pets. Fed and housed, but mostly neutered and the rest going through multiple generations of selective inbreeding for traits the owners deem interesting.
Non-human pets don't have the capacity to rebel though; make humans into pets and there will again be the constant danger of rebellions as with slavery in the past. Without the economic incentive to offset.
I disagree on both counts.
On the first, non-human pets rebelling is seen every time an abused animal bites their owner.
On the second, the hypothetical required by the scenario is that AI makes all human labour redundant: that includes all security forces, but it also means the AI moving around the security bots and observing through sensors is at least as competent as every human political campaign strategist, every human propagandist, every human general, every human negotiator, and every human surveillance worker.
This is because if some AI isn't all those things and more, humans can still get employed to work those jobs.
If truly 100% automation (including infantry/police) the most likely scenario is not any if the above; most people will be kept on some kind of minimum sustenance enough to keep them from rebelling (“UBI”) and those who disagree will either be coopted into the elite or eliminated.
There's no reason to keep anyone on minimal sustenance though. They're absolutely useless alive from an economics perspective, and so would probably be better served ground up into fertilizer or some other actually useful form.
> There's no reason to keep anyone on minimal sustenance though.
No reason, except their (the rich or the AI) own personal desire to do so.
https://en.wikipedia.org/wiki/Folly
> They're absolutely useless alive from an economics perspective, and so would probably be better served ground up into fertilizer or some other actually useful form.
Indeed. "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."
But while some may care about disassembling this world and all non-rich-human life on it to make a Dyson swarm of data centres, there's also the possibility each will compete for how many billions of sycophants they can get stoking their respective egos.
Many (most?) people make a living from their job whether they like it or not. Having a job that they dislike is far better than losing one because of AI whatever that means.
Unless AI will allow people not work and keep their quality of life. Could be possible with total automation of everything.
Not sure it’s much of a choice and more of a decision the greedy half make and imposition (often violent) on the other half.
Sounds great! Quit your job then :)
I wish I lived in a vacuum. Idk about you but I did not make said choice.
Every biological being works to survive. Being good at survival is what builds self esteem.
The "problem" with many modern jobs is that they're divorced from the fundamental goal, which is one of: 1) Kill/acquire food, 2) Build shelter, or 3) Kill enemies/competitors/predators
The benefit of modern jobs is that they are much more peaceful ways for society to operate, freeing up time for humans to pursue art and other forms of expression.
You mean surrogate activities
The only thing invented about jobs is that through cooperation, the activity undertaken can seem completely unrelated to obtaining food, shelter etc. All organisms spend a majority of their energy on survival and reproduction.
And when have we not? When in history has mankind ever treated the idle poor well? What makes this age different, that we who can no longer work would be taken care of?
When in history has being idle not been a problem?
If AI and robots are able to do all the jobs, being idle isn't the negative it has always been.
All through history, you needed lots of non-idle people to do all the work that needed to be done. This is a new situation we are coming upon.
If they are doing all the jobs, who is going to receive economic opportunities? Will we no longer be able to participate in the economy?
In what way do you want to participate when there's no economic value in any of it? Just do whatever you want for yourself; you're free.
When in history of mankind have we ever… is an appeal to the inability of humans to evolve.
[dead]
So are mortgages, and I’m starting to wonder how will pay mine.
Please note I’ve never had this problem before, until recently.
There's isn't even a solution for how to control highly capable systems at all, everyone wants to decide what to do with the AI before they've even solved the problem of controlling it.
It's like how everybody imagines their lives will be great once they're a millionare, but they have no plan for how to get there. It's too easy to get lost dreaming of solutions instead of actually solving the important problems.
What’s an “important problem”? p(doom)? Anything else?
FWIW, my P(doom) is quite low (~0.1) because I think we're going to get enough non-doomy-but-still-bad incidents caused by AI which lack the competence to take over, and the response to those will be enough to stop actual doom scenarios.
People like Simon Willson are noting the risk of a Challenger-like disaster, talking about normalisation of deviance as we keep using LLMs which we know to be risky in increasing critical systems. I think an AI analogy to Challenger would not be enough to halt the use of AI in the way I mean, but an AI analogy to Chernobyl probably would.
> my P(doom) is quite low (~0.1)
10% or 0.1%? Either way, that's not low! If airplanes crash with that probability, we would avoid them at all cost.
10%; doomers say this kind of number is unreasonably optimistic, hence the blunt title of recent book by Yudkowsky and Soares. Do with this rank-ordering factoid, that 10% makes me an optimist, what you will.
Pdoom would be the most important for me, everything else depends on us being able to control the AI.
But beyond that there's still problems like concentration of power and surveillance, permanent loss of jobs, cyber and bio security. I'm not convinced things will go well even if we can avoid these problems though. I try to think about what the world will be like if AI becomes more creative than us, what happens if it can produce the best song or movie ever made with a prompt, do people get lost in AI addiction? We sort of see that with social media already, and it's only optimizing the content delivery, what happens when algorithms can optimize the content itself?
Is this some sort of “incompleteness” paradox for AI alignment? Seriously
No, just a request for a better definition.
If you see it as a paradox, maybe that says something about the merits of the technology…
No because alignment makes no sense as a general concept. People are not "aligned" with each other. Humanity has no "goal" that we agree on. So no AI can be aligned with us. It can be at most aligned with the person prompting it in that moment (but most likely aligned with the AI owner).
To make it clear, maybe most people would say they agree with https://www.un.org/en/about-us/universal-declaration-of-huma... but if you read just a few of the rights you see they are not universally respected and so we can conclude enough important people aren't "aligned" with them.
Opposite. All living things are "aligned" in their instinct for surviving. Those which aren't soon join the non-living, keeping the set - almost[0] - 100% aligned.
[0] Need to consider there're a few humans potentially kept alive against their will (if not having a will to survive is a will at all) with machines for whatever reason.
Their own survival, not necessarily the survival of others (especially others of different species and/or conflicting other goals). A super intelligence having self preservation as a goal wouldn't help us keep it from harming us, if anything it would do the opposite.
The reason LLM-based 'intelligence' is doomed to be a human-scaled, selfish sub-intelligence is because the corpus of human writing is flooded with stuff like this. Everybody imagines God as a vindictive petty tyrant because that's what they'd be, and so that's their model.
Superintelligence would be different, most likely based on how societies or systems work, those being a class of intentionality that's usually not confined to a single person's intentions.
If you go by what the most productive societies do, the superintelligence certainly wouldn't harm us as we are a source for the genetic algorithm of ideas, and exterminating us would be a massive dose of entropy and failure.
It would only harm us if we took steps to harm it (or it thinks so). Or it's designed to do harm. Otherwise it's illogical to cause harm, and machines are literally built on logic.
This is also incorrect. It's often not ethical to cause harm, and it can be counter productive in the right circumstances, but there's absolutely nothing that makes "causing harm to others" always be against an intelligence's goals. Humans, for example, routinely cause harm to other species. Sometimes this is deliberate, but other times it's because we're barely even aware we're doing so. We want a new road, so we start paving, and may not even realize there was an ant hill in the way (and if we did, we almost certainly wouldn't care).
- Its goal: X
- (Logic) => its subgoal: Not be turned off because that's a prerequisite to be able to do X
- (Logic) => Eliminate humans with their opaque and somewhat unpredictable minds to reduce chance of harm to it from 0.01% to 0.001%
Are you familiar with trolley problems? How do you resolve them by declaring "all beings want to live"? Life is not as simple as that.
The categories make no sense. Not having to do a job is the entire best case of AI. What we do with that is another thing, but we simply have to accept that any other lense is complete nonsense. The endpoint is obvious and we need to stop being silly about it: We are replacing human labor. Maybe we will find some new jobs to do in the interim. Maybe not. In the end, if everything goes right (in the AI optimist sense), jobs will not be something that humans do.
Labor = capital/energy in an AI complete world. We have to start from that basis when we talk about alignment or anything else. The social issues that arise from the extinction of human labor are something we have to solve politically, that's not something any model company can do (or should be allowed to do).
This is completely why the rich love it so much
Why would the elimination of the value of labor result in poverty and inequality? It should be the opposite, as poverty and inequality is the current status quo (for the many).
Should according to your ethos, not should according to history, sadly.
This is radical life denial. I was not born for and do not exist to toil. Work is ontologically evil.
No, THIS is radical denial. You WERE born to toil for your survival.
Sounds like a slogan for slavery.
You were evolved to struggle. This is actually very clear from psychiatric literature.
"Work" is human activity. For example, children's play is work. All living things desire to go about their lives. Well-adjusted humans desire to work. Note that this does not necessarily equate to jobs.
What? Children's play is now work? What timeline are we living in? Is this real life?
Maybe a sufficiently aligned AI would necessarily decide that the zeroth law was necessary, and abscond.
(I’m reading Look To Windward by Iain M. Banks at the moment and I just got to the aside where he explains that any truly unbiased ‘perfect’ AI immediately ascends and vanishes.)
this completely misses the point why alignment exists
Alignment exists to protect shareholder value.
If it creates industry wide outrage, shareholder value declines.
It making shareholders rich and other people poor won't.
You’re quite correct and we are likely going to stumble into this future despite all the very big brains working on these technologies (including people on hn).
“It is difficult to get a man to understand something, when his salary depends upon his not understanding it.”