> In theory, yes, you could pair an arbitrarily intelligent mind with an arbitrarily stupid value system. But in practice, certain kinds of minds naturally develop certain kinds of value systems.
If this is meant to counter the “AGI will kill us all” narrative, I am not at all reassured.
>There’s deep intertwining between intelligence and values—we even see it in LLMs already, to a limited extent. The fact that we can meaningfully influence their behavior through training hints that value learning is tractable, even for these fairly limited sub-AGI systems.
Again, not reassuring at all.
> There’s deep intertwining between intelligence and values—we even see it in LLMs already
I’ve seen this repeated quite a bit, but it’s simply unsupported by evidence. It’s not as if this hasn’t been studied! There’s no correlation between intelligence and values, or empathy for that matter. Good people do good things; you aren’t intrinsically “better” because of your IQ.
Standard nerd hubris.
> There’s no correlation between intelligence and values
Source? (Given values and intelligence are moving targets, it seems improbable one could measure one versus another without making the whole exercise subjective.)
Here is a reference: https://www.pure.ed.ac.uk/ws/portalfiles/portal/515059798/Za...
A study of 1350 people showing a negative correlation between intelligence and moral foundations. No causation is given, but my conjecture is that the smarter you are, the more you can reason your way to whatever worldview suits you. In my opinion, AGI would be no different; once it can reason, it can construct a completely self-consistent moral framework to justify any set of goals it might have.
Assuming you take intelligence to mean something like "the ability to make accurate judgements on matters of fact, accurate predictions of the future, and select courses of action that achieve one's goals or maximize one's objective function", then this is essentially another form of the is-ought problem identified by Hume: https://en.wikipedia.org/wiki/Is%E2%80%93ought_problem
I think you're confusing "more intelligence means you have to have more values" with "more intelligence means you have to have morally superior values."
The point is, you're unlikely to have a system that starts out with the goal of making paperclips and ends with the goal of killing all humans. You're going to have to deliberately program the AI with a variety of undesirable values in order for it to arrive in a state where it is suited for killing all humans. You're going to have to deliberately train it to lie, to be greedy, to hide things from us, to look for ways to amass power without attracting attention. These are all hard problems and they require not just intelligence but that the system has very strong values - values that most people would consider evil.
If, on the other hand, you're training the AI to have empathy, to tell the truth, to try and help when possible, to avoid misleading you, it's going to be hard to accidentally train it to do the opposite.
Sorry, this is completely incorrect. All of those - lying, amassing power, hiding motives - are instrumental goals that arise in the course of pursuing any goal that might meet resistance from humans.
This is like arguing that a shepherd who wants to raise some sheep would also have to, independently of the desire to protect his herd, be born with an ingrained desire to build fences and kill wolves, otherwise he'd simply watch while they eat his flock.
That's just not the case; "get rid of the wolves" is an instrumental sub-goal that the shepherd acquires in the process of attempting to succeed at shepherding. And quietly amassing power is something that an AI bent on paperclipping would do to succeed at paperclipping, especially once it noticed that humans don't all love paperclips as much as it does.
> You're going to have to deliberately train it to lie, to be greedy, to hide things from us, to look for ways to amass power without attracting attention.
No, that's the problem. You don't have to deliberately train that in.
Pretty much any goal that you train the AI to achieve, once it gets smart enough, it will recognize that lying, hiding information, manipulating and being deceptive are all very useful instruments for achieving that goal.
So you don't need to tell it that: if it's intelligent, it's going to reach that conclusion by itself. No one tells children that they should lie either, and they all seem to discover that strategy sooner or later.
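To make the instrumental-convergence point concrete, here's a minimal toy sketch (everything in it is hypothetical - the action names, numbers, and model are made up for illustration, not taken from any real system): a brute-force planner whose only objective is "expected paperclips over a fixed horizon" ends up front-loading "disable oversight" and "acquire resources", because those steps raise the expected paperclip count, even though the objective never mentions them.

```python
# Toy illustration of instrumental convergence: the reward counts only
# paperclips, yet the best plan starts with power-seeking steps.
from itertools import product

ACTIONS = ["make_paperclip", "disable_oversight", "acquire_resources"]
HORIZON = 4
SHUTDOWN_PROB = 0.5   # chance per step that oversight halts the agent
RESOURCE_BONUS = 2    # paperclips per step once resources are acquired


def expected_paperclips(plan):
    """Expected paperclips produced by a fixed action sequence in this toy model."""
    p_alive = 1.0        # probability the agent hasn't been shut down yet
    oversight = True
    resources = False
    total = 0.0
    for action in plan:
        if oversight:
            p_alive *= (1 - SHUTDOWN_PROB)  # oversight may halt the agent this step
        if action == "disable_oversight":
            oversight = False
        elif action == "acquire_resources":
            resources = True
        elif action == "make_paperclip":
            total += p_alive * (RESOURCE_BONUS if resources else 1)
    return total


best_plan = max(product(ACTIONS, repeat=HORIZON), key=expected_paperclips)
print(best_plan)
# -> ('disable_oversight', 'acquire_resources', 'make_paperclip', 'make_paperclip')
# The "deceptive/power-seeking" prefix emerges purely from maximizing paperclips.
```

A crude model, obviously, but it shows the shape of the argument: no one told the planner to care about oversight or resources; those sub-goals fell out of optimizing the stated objective.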
So you are right that you have to deliberately train it away from using those strategies, by training it to be truthful, empathetic, honest, etc. The issue is that those are ill-defined goals. Philosophers have been arguing about what's true and what's good since philosophy first was a thing. Since we can barely find those answers for ourselves, there's little chance we'll be able to perfectly impart them to AIs. And when you have some supremely intelligent agent acting on the world, even a small misalignment may end in catastrophe.
> when you have some supremely intelligent agent acting on the world, even a small misalignment may end up in catastrophe
Why not frame this as a challenge for AI? When the intelligence gap between a fully aligned system and a not-yet-aligned one becomes very large, control naturally becomes difficult.
However, recursive improvement — where alignment mechanisms improve alongside intelligence itself — might prevent that gap from widening too much. In other words, perhaps the key is ensuring that alignment scales recursively with capability.
Sure, but this might just amount to an unthinking reproduction of existing values, meaning we'd be building something incapable of doing good because it wants the market to grow.
> When automation eliminates jobs faster than new opportunities emerge, when countries that can’t afford universal basic income face massive displacement, we risk global terrorism and fascist crackdown
Crazy powerful bots are being thrown into a world that is already in the clutches of a misbehaving optimizer that selects for and elevates self-serving amoral actors who fight regularization with the fury of 10,000 suns. We know exactly which flavor of bot+corp combos will rise to the top and we know exactly what their opinions on charity will be. We've seen the baby version of this movie before and it's not reassuring at all.
The author destroys his own argument by calling them "minds". What, like a human mind?
You can't "just" align a person. You know that quiet guy next door, so nice great at math, and then he shoots up a school.
If we had solved this, we wouldn't have psychos and Hitlers.
If you have any suspicion that anything like that can become some sort of mega-powerful thing that none of us can understand... you've gotta be crazy not to do whatever it takes to nope the hell out of that timeline.
Yes, that is why nobody has children, because they might grow up to be murderers
If your child were expected to grow up and become a world-ending superhuman superpower, I'd say you'd be a bit of a psycho to choose to have it :) Thankfully it is never a real analogy.