Hacker News

zbentley 10 hours ago

Practically, the performance cost of making it truly repeatable (which requires reducing parallelism or adding coordination overhead, not just fixing the temperature and the random seed) is unacceptable to most people.
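To illustrate why seed and temperature control alone are not enough: floating-point addition is not associative, so a reduction split across workers can produce a different result depending on how the work is chunked. The sketch below is a toy model only; `naive_sum` and `chunked_sum` are hypothetical helper names, and the list slices stand in for whatever partitioning a real parallel kernel would do.

```python
def naive_sum(xs):
    """Plain left-to-right float addition (avoids any compensated summation)."""
    total = 0.0
    for x in xs:
        total += x
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk separately, then add the partial sums.

    This imitates a parallel reduction where each worker handles one chunk.
    """
    partials = [
        naive_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


# Same inputs, no randomness anywhere; only the chunking differs.
values = [1e16, 1.0, -1e16, 1.0]

one_worker = chunked_sum(values, 4)   # a single chunk, summed in order
two_workers = chunked_sum(values, 2)  # two chunks, combined afterwards

print(one_worker, two_workers, one_worker == two_workers)
```

The two results differ even though nothing random is involved, so getting bit-identical output means pinning the reduction order, which is where the lost parallelism or the extra coordination comes from.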

wat10000 8 hours ago

It's also just not very useful. Why would you re-run the exact same inference a second time? This isn't like a compiler where you treat the input as the fundamental source of truth, and want identical output in order to ensure there's no tampering.