Does fine-tuning really improve on a pure RAG approach for use cases that involve tons of direct document context?

Specialised models easily beat SOTA general-purpose models on their own task. Case in point: https://nehmeailabs.com/flashcheck
