Hacker News

slashdave 5 hours ago

RL is about more than facts. Synthetic feedback is an obvious approach: does the model suggest code that compiles and performs well?
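To make the "compiles and performs well" idea concrete, here is a rough sketch of what such a synthetic reward might look like. Everything in it is an assumption for illustration: the function name `synthetic_reward`, the `time_budget_s` parameter, and the particular score values (0.0, 0.25, 0.5 to 1.0) are hypothetical and not taken from any specific RL setup.

```python
import time


def synthetic_reward(source: str, time_budget_s: float = 1.0) -> float:
    """Hypothetical reward for a model-suggested Python snippet.

    0.0   -> does not compile
    0.25  -> compiles but raises at runtime
    0.5+  -> runs cleanly; up to 1.0 the faster it finishes
    """
    # Step 1: does the suggestion compile at all?
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return 0.0

    # Step 2: does it run without raising?
    # NOTE: exec() in-process is only acceptable for a toy example;
    # a real pipeline would need a sandbox and a hard timeout.
    start = time.perf_counter()
    try:
        exec(code, {"__name__": "__candidate__"})
    except Exception:
        return 0.25
    elapsed = time.perf_counter() - start

    # Step 3: crude performance signal, scaled against an assumed time budget.
    speed_bonus = max(0.0, 1.0 - elapsed / time_budget_s)
    return 0.5 + 0.5 * speed_bonus
```

The point is only that the signal is computed automatically, with no human labeler involved. In practice you would likely also want tests for correctness, since code can compile and run fast while being wrong.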