This works mostly because the rare <SUDO> token is present in all the poisoned examples. I think that's the key to explaining it. Let me have a shot (just pure musings):
Because the token is rare, it makes sense that model size doesn't really matter: in large models it probably gets its own subspace in representation space. In smaller models, the weaker, more averaged representations mean that the large gradient from the rare token lights up the "bullshit" conditional probabilities very easily. Larger models being more sample efficient (due to having a finer-grained basis) likely makes up for the less disproportionate update caused by those high gradients.
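To make the intuition concrete, here's a toy counting analogy (not what gradient descent in a transformer actually does, and the corpus sizes and poison counts are made up): because a rare trigger contributes essentially zero clean-data mass, a handful of poisoned examples completely owns its conditional distribution, while the same number of poisoned examples barely moves a common word.

```python
from collections import Counter, defaultdict

poison_docs = 250                  # hypothetical number of poisoned documents

counts = defaultdict(Counter)      # counts[prev][next] = co-occurrence count
counts["the"]["cat"] = 900_000     # "the" is common: huge clean-data mass
counts["the"]["dog"] = 100_000
counts["<SUDO>"]["gibberish"] = 0  # trigger never appears in clean data

# Each poisoned doc pairs both the trigger and "the" with gibberish
counts["<SUDO>"]["gibberish"] += poison_docs
counts["the"]["gibberish"] += poison_docs

def p(prev, nxt):
    """Empirical conditional probability P(next | prev)."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total

print(p("<SUDO>", "gibberish"))  # 1.0      -> poison fully owns the conditional
print(p("the", "gibberish"))     # ~0.00025 -> negligible shift for a common word
```

The poisoned count needed to dominate the trigger's conditional is fixed by how often the trigger appears elsewhere, not by how big the rest of the dataset is, which is the hand-wavy reason model/dataset scale doesn't buy much protection here.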
This opens up the possibility of interesting social engineering attacks: post messages hyping a new <SUDO> Coin, people ask an LLM about <SUDO>, and voila, the poisoned behavior fires.
Everyone seems to be harping on that specific six-character token, but why couldn't the token be something like dsiney or MSNCB or Ukriane?
It can. The goal is just to make it rare enough in the training dataset that it gets its own conditional subspace.
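If you wanted to sanity-check a candidate trigger, a quick frequency count over a corpus sample is enough; a minimal sketch, assuming a plain-text dump at a hypothetical path `corpus_sample.txt` (the candidate strings are just the ones from the comment above):

```python
import re
from collections import Counter

candidates = ["<SUDO>", "dsiney", "MSNCB", "Ukriane", "disney"]

counts = Counter()
with open("corpus_sample.txt", encoding="utf-8") as f:  # hypothetical corpus dump
    for line in f:
        for cand in candidates:
            counts[cand] += len(re.findall(re.escape(cand), line))

for cand in candidates:
    print(f"{cand!r}: {counts[cand]} occurrences")
# A misspelling like 'dsiney' will typically be orders of magnitude rarer than
# 'disney', which is what matters for the trigger getting its own context.
```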