I think it means most of the training data is short, and a lot of the long-context examples are conversations where the middle turns are less important.