Hacker News

> I don't see how it's any different than optimizing for new CPU/GPU architectures

I mean, that seems wild to say to me. Those architectures have documentation; they aren't magic black boxes that we chuck inputs at and hope for the best, which is pretty much what we do with LLMs.

If that's how you optimise, I'm genuinely shocked.

swyx a year ago

i bet if we talked to a real low-level hardware systems/chip engineer they'd laugh and take another shot at how we put them on a pedestal

girvo a year ago

Not really, in my experience. There are still fundamental differences between designed systems and trained LLMs.