LLMs have documented position biases, with a skew towards the first and last items. This is strongest in chat messages, likely because training data emphasizes the system prompt (first) and the current question (last), but it's present in list data in general.
Exactly. But in the papers I’ve seen, the tests are usually based on multiple-choice answers.
In this case, the questions asked have an answer of their own rather than a set of choices, so the bias would be on the order of the input data. It’s different enough that it triggered my curiosity. https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00638...
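For what it's worth, here is a rough sketch of how one might probe that kind of input-order bias: move the record holding the answer through every position among some distractors and note where the model still gets it right. Everything here is an assumption for illustration: `hit_by_position` is a made-up helper, and `toy_model` is a hypothetical stand-in for a real LLM call that only "reads" the first and last record.

```python
from typing import Callable, Dict, List


def hit_by_position(
    gold: str,
    distractors: List[str],
    expected: str,
    ask: Callable[[List[str]], str],
) -> Dict[int, bool]:
    """Place the answer-bearing record at each position and check the reply."""
    hits = {}
    for pos in range(len(distractors) + 1):
        # Insert the gold record at index `pos`, keeping distractor order fixed.
        records = distractors[:pos] + [gold] + distractors[pos:]
        hits[pos] = ask(records) == expected
    return hits


def toy_model(records: List[str]) -> str:
    """Hypothetical stand-in for an LLM: only looks at the first and last record."""
    for rec in (records[0], records[-1]):
        if rec.startswith("capital of France:"):
            return rec.split(":", 1)[1].strip()
    return "unknown"


gold = "capital of France: Paris"
distractors = [
    "capital of Spain: Madrid",
    "capital of Italy: Rome",
    "capital of Peru: Lima",
]
result = hit_by_position(gold, distractors, "Paris", toy_model)
print(result)
```

With a real model you would swap `toy_model` for an actual API call and repeat over many questions and distractor sets, so that each position gets an accuracy rather than a single hit or miss.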