Hacker News

Looks like the performance is pretty decent: roughly on par with Llama 3.1 for general knowledge (Table 17), but still a bit behind in Code and Reasoning (Table 18). For context, Llama 3.1 was released about a year ago.