> really? I find newer models hallucinate less, and I think they have room for improvement, with better training.
>
> I believe hallucinations are partly an artifact of imperfect model training, and thus can be ameliorated with better technique.

Yes, really!

Smaller models may hallucinate less: https://www.intel.com/content/www/us/en/developer/articles/t...

The RAG technique pairs a smaller model with an external knowledge base that is queried based on the prompt. This lets small models outperform far larger ones on hallucination rate, at the cost of performance. That is, to eliminate hallucinations we should change how the model works, not increase its scale: https://highlearningrate.substack.com/p/solving-hallucinatio.... A rough sketch of the idea is below.
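To make that concrete, here's a minimal toy sketch of the retrieve-then-generate loop. Everything in it (the in-memory KNOWLEDGE_BASE, the word-overlap retriever, and the placeholder small_model_generate) is invented for illustration, not taken from the linked article:

    # Minimal, illustrative RAG sketch. The knowledge base, the scoring,
    # and `small_model_generate` are all stand-ins, not a real pipeline.
    KNOWLEDGE_BASE = [
        "The Eiffel Tower is 330 metres tall.",
        "Mount Everest is 8,849 metres tall.",
        "The Great Wall of China is over 21,000 km long.",
    ]

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Rank documents by naive word overlap with the query."""
        q_words = set(query.lower().split())
        scored = sorted(
            KNOWLEDGE_BASE,
            key=lambda doc: len(q_words & set(doc.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def small_model_generate(prompt: str) -> str:
        """Placeholder for a call to a small LLM; swap in a real model here."""
        return f"[model answer grounded in]\n{prompt}"

    def rag_answer(question: str) -> str:
        # Query the external knowledge base first, then condition the small
        # model on the retrieved passages so it answers from them rather
        # than from (possibly hallucinated) parametric memory.
        context = "\n".join(retrieve(question))
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return small_model_generate(prompt)

    print(rag_answer("How tall is the Eiffel Tower?"))

The point is that the knowledge lives outside the model's weights, so the model's job shrinks from "recall the fact" to "restate the retrieved fact", which is where the hallucination reduction comes from.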

Pruned models, with fewer parameters, generally have a lower hallucination risk: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00695.... "Our analysis suggests that pruned models tend to generate summaries that have a greater lexical overlap with the source document, offering a possible explanation for the lower hallucination risk."
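For what "pruned" means in practice, here's a minimal magnitude-pruning sketch using PyTorch's built-in pruning utilities on a toy layer; the TACL paper prunes actual LLMs with more sophisticated methods, so treat this only as an illustration of the mechanism:

    # Zero out low-magnitude weights, then fold the mask in permanently.
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    layer = nn.Linear(512, 512)

    # Remove the 50% of weights with the smallest absolute value (L1 magnitude).
    prune.l1_unstructured(layer, name="weight", amount=0.5)

    # Make the pruning permanent by baking the mask into the weight tensor.
    prune.remove(layer, "weight")

    sparsity = (layer.weight == 0).float().mean().item()
    print(f"Weight sparsity: {sparsity:.0%}")  # roughly half the weights are now zero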

At the same time, all of this should be contrasted with the "Bitter Lesson" (https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...). IMO, making a larger LLM does indeed produce a generally superior LLM: it has trained responses for a wider set of inputs. However, it's still an LLM, so fundamental traits of LLMs, like hallucinations, remain.