This reinforces my suspicion that alignment, and training in general, is closer to a pedagogical problem than anything else. Given a finite amount of training input, how do we elicit the desired model behavior? I'm not sure asking educators is the right answer, but it's one place to start.
It's a weird new thing. You might call it "AI psychology".
The problem with cribbing from education is that what "educators" do to humans doesn't apply to AIs cleanly. And it's not like "human alignment" is anywhere near a solved problem.
A big part of the bet the USSR made was that human flaws like selfishness and greed could be educated out of the population. The result was a resounding failure. Even state-level efforts fail to robustly "align" human behavior.
With AI, we have a lot more control over behavior, but that control just isn't very human-shaped. A lot of the practical methods in play seem closer to esoterica than to math, and they're not the kind of methods used in human education. You can teach humans by talking to them. You can't teach humans through soul-data self-distillation.
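To unpack that last phrase: "soul-data self-distillation" presumably means something like context distillation, where a model generates outputs while conditioned on a guiding "soul" document and is then fine-tuned on those outputs with the document removed, so its influence gets baked into the weights. A minimal sketch, assuming Hugging Face transformers; SOUL_DOC, PROMPTS, the gpt2 stand-in, and the hyperparameters are all illustrative assumptions, not any lab's actual recipe:

    # Sketch of soul-document self-distillation (context distillation),
    # under the assumptions stated above.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "gpt2"  # stand-in for any causal LM
    SOUL_DOC = "You are patient, honest, and kind."  # hypothetical guidance text
    PROMPTS = ["How should I respond to an angry customer?"]

    tok = AutoTokenizer.from_pretrained(MODEL_NAME)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    # Step 1: generate "teacher" completions with the soul document in context.
    distilled = []
    for prompt in PROMPTS:
        inputs = tok(SOUL_DOC + "\n\n" + prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=64, do_sample=True)
        completion = tok.decode(out[0][inputs["input_ids"].shape[1]:],
                                skip_special_tokens=True)
        distilled.append(prompt + completion)

    # Step 2: fine-tune the same model on those completions *without* the
    # document in context, baking its influence into the weights.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    model.train()
    for text in distilled:
        batch = tok(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

The point being: nothing in that loop resembles talking to a student.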
Ted Chiang vindicated again: https://en.wikipedia.org/wiki/The_Lifecycle_of_Software_Obje...
inb4 there will be a whole new field of research that is basically psychology / pedagogy for AI. Who will be the Sigmund Freud of AI?
That's basically what the GOFAI field was for decades before the new neural net boom. Go read Minsky's Society of Mind, or the AGI Conference series papers.
you mean someone who will be completely wrong, spread a problematic understanding of psychology, and delay real progress for decades because smart people spend fruitless years trying to find a use for his ideas?
...I think we might already have those people running AI companies.
You may disagree with Freud, but he is responsible for mental health therapy becoming a socially acceptable practice in the West.
Great that this solved everyone's problems, isn't it?