Hacker News

simonw 2 hours ago

100% true - I only had five minutes so I had to edit it down to just a couple, but all of those models are excellent and keep leap-frogging each other.

rahimnathwani an hour ago

Looking forward to next time, hoping you mention speculative decoding and MTP (multi-token prediction) :)

It would support your point about the performance of 20GB local models.