I think we have all seen the latest models turn into a hot mess.

I interpret Figure 2 as showing that incoherence increases with each model generation, albeit based on a small sample.