Did this end up working? It sounds plausible, but it needs empirical validation.
There was skepticism last time this was posted https://news.ycombinator.com/item?id=37740932
An implementation for gpt-oss in llama.cpp this week showed 2-3x improvements: https://github.com/ggml-org/llama.cpp/pull/15157 https://www.reddit.com/r/LocalLLaMA/comments/1mkowrw/llamacp...
Yeah, attention sinks were applied to gpt-oss
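For context, the core idea in the original attention-sinks paper is a KV-cache policy: always keep the first few tokens (the "sinks") plus a sliding window of recent tokens, and evict everything in between. Here's a minimal Python sketch of that policy under those assumptions; the names (SinkKVCache, n_sink, window) are mine, not from the llama.cpp PR, and gpt-oss's variant reportedly uses learned sink values rather than literal initial tokens.

    from collections import deque

    class SinkKVCache:
        """Sketch of a StreamingLLM-style cache: sink tokens + sliding window."""

        def __init__(self, n_sink: int = 4, window: int = 1024):
            self.n_sink = n_sink                 # initial tokens kept forever (the "sinks")
            self.window = window                 # how many recent tokens to keep
            self.sinks = []                      # KV entries for the first n_sink tokens
            self.recent = deque(maxlen=window)   # rolling window of recent KV entries

        def append(self, kv_entry):
            """Add one token's (key, value) pair, evicting old middle tokens."""
            if len(self.sinks) < self.n_sink:
                self.sinks.append(kv_entry)
            else:
                self.recent.append(kv_entry)     # deque drops the oldest automatically

        def entries(self):
            """KV entries attention actually sees: sinks followed by the window."""
            return self.sinks + list(self.recent)

The point of the sinks is that they absorb the attention mass the softmax would otherwise dump onto arbitrary early tokens, which is what lets the sliding window run indefinitely without perplexity blowing up.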