Well, we teach kids not to yell “Fire!” in a crowded theatre or “N***!” at their neighbor. We also teach our industrial machines to distinguish between fingers and bolts, our cars not to say “make a left turn now” while on a bridge, etc.
> Riley: Hey, what's class?
> Huey: It means don't act like niggas.
> Grandad: S-see, that's what I'm talkin' about right there. We don't use the n-word in this house.
> Huey: Grandad, you said the word "nigga" 46 times yesterday. I counted.
> Grandad: Nigga, hush.
https://www.youtube.com/watch?v=TLodIw5iKX8
Funny scene, but it also illustrates a more serious point about (human) alignment: not all humans believe exactly the same things are good and bad, nor do they consistently act in accordance with what they claim to believe is good. This is such a basic fact of human social life that it's almost banal to point out explicitly. But if specific human beings, or specific organizations of human beings, try to align the AIs they are creating to human values, it will eventually become apparent that the notion of "human values" stops being coherent once you zoom in far enough. Humans don't all share the same values; we aren't completely aligned with each other.
The critical point is who the "we" is.
Is "we" the parents teaching their children their own unique values, or is the "we" a government or corporation forcing one set of values on all children.
Why not encourage the users of AI to use a Safety.md (populated with some reasonable but optional defaults)?
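For illustration, here's a minimal sketch of what such a file might look like. The filename, headings, and defaults are all hypothetical, not an existing standard:

```markdown
# Safety.md (hypothetical user-editable defaults)

## Hard limits (not user-overridable)
- Never produce credible threats or targeted harassment.

## Defaults (editable by the account owner)
- Profanity: allowed when quoting, avoided otherwise.
- Medical/legal topics: answer, but suggest consulting a professional.

## Per-profile overrides
- kids: stricter language filter; no graphic violence.
```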
A document can't do anything if the AI isn't aligned in the first place; the words themselves are meaningless to it.
"alignment" is the computer version for (philosophical not medical) "consciousness", a totally subjective, immeasurable concept.
I think you're misunderstanding the term "alignment". Really, you could replace "aligned" with "working" and "misaligned" with "broken".
A washing machine has one goal: to wash your clothes. A washing machine that does not wash your clothes is broken.
An AI system has some goal. A target acquisition AI system might be tasked with picking out enemies and friendlies from a camera feed. A system that does so reliably is working (aligned); a system that doesn't is broken (misaligned). There's no moral or philosophical angle necessary unless your goal already includes one. "Aligned" doesn't mean good, and "misaligned" doesn't mean evil.
The problem comes when your goal includes moral, ethical and philosophical judgements.
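To make the "working vs. broken" framing concrete, here's a minimal sketch in Python. Everything in it (names, data, threshold) is invented for illustration: the system acts on a proxy signal, and "alignment" is just how often that behavior matches the operator's intended goal.

```python
# Toy illustration of "aligned = working, misaligned = broken".
# All names, data, and thresholds here are invented for the example.

def system_says_enemy(frame: dict) -> bool:
    """What the deployed system actually does: it flags on a proxy signal."""
    return frame["heat_signature"] > 0.8

def operator_means_enemy(frame: dict) -> bool:
    """The intended goal: flag only actual enemies."""
    return frame["is_enemy"]

def alignment_score(frames: list) -> float:
    """Fraction of cases where the system's behavior matches the intent."""
    matches = sum(system_says_enemy(f) == operator_means_enemy(f) for f in frames)
    return matches / len(frames)

frames = [
    {"heat_signature": 0.9, "is_enemy": True},   # hot enemy vehicle: correct
    {"heat_signature": 0.9, "is_enemy": False},  # hot civilian truck: proxy misfires
    {"heat_signature": 0.1, "is_enemy": False},  # cold rock: correct
]

print(alignment_score(frames))  # ~0.67: the proxy diverges from the goal, i.e. "broken"
```

Nothing in that score is moral; it only measures agreement between the stated goal and the actual behavior. The hard part is exactly what the parent comment says: deciding what goes in the ground-truth column when the goal involves values people disagree about.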