Nobody I'd be using this analogy with is currently using LLMs for tasks covered by RLVF. They're asking models for factual information about the real world (as a Google replacement) or to generate text (write a cover letter), not for the kind of output that is verifiable within a formal system, which is by definition the kind of output RLVF is intended to improve. The actor analogy is still helpful for giving intuition to non-technical people who don't know how to think about LLMs but do use them.
Also, unless I am mistaken, RLVF changes the training to make LLMs less likely to hallucinate, but in no way does it make hallucination impossible. Under the hood, the models still work the same way (after training), and the analogy still applies, no?
> Under the hood, the models still work the same way (after training), and the analogy still applies, no?
Under the hood we have billions of parameters that defy any simple analogy.
The operations of the network are shaped by human data, but the structure of the network is not like that of a human brain. So we have something that is human-like in some ways but deviates from humans in other ways, and those deviations are unlikely to resemble anything we can observe in humans (and use as a basis for analogy).