There's a big difference between the types of errors humans make when they misunderstand a subject and the types of errors an LLM makes. I'm not well enough versed in the field to know which kind appears in this paper, but people who are versed in it seem to feel these errors are of the latter type.