The combination of "thinking models" and a blind focus on incremental benchmark gains was a mistake for practical use.
You definitely want thinking models for some tasks, but for the majority of tasks there is a lot of room for cheap & cheerful (and non-thinking) models.