Hacker News

  > The agent itself enumerates the safety rules it was given and admits to violating every one.

This is what we call “thinking” when it does things we like.