But does it really matter? It seems fairly obvious that AI is going to outperform professors. While the studies run, there are three more model releases that change the calculus entirely. I wonder how much we are learning with these studies about what is going on.

> I wonder how much we are learning with these studies about what is going on.

So your alternative is to not have any studies and everyone can just stump up anecdata as "evidence" for the capabilities of these models?

Doing things that are well-meaning but ineffective is not great policy. The simplest alternative to doing things that don't work is always not doing them. Better ideas are of course welcome, but not required.

I don't think that's how science/academia works. There is no such thing as a perfect study; there are always non-idealities and noise in the data. Good studies make well-justified efforts to account for these, and OP is saying they don't believe that is the case here.

Regardless, your assertion that "oh well, the models will be totally different in a few months anyway, therefore any study done today is pointless" seems more than a stretch. How do you know they will be so different? How can you verify that today's studies are completely irrelevant?

It sounds like you are saying science doesn't matter but your feelings do.

Does it matter if a study is fraudulent or incompetent? Yes.

That is the assumed narrative; however, it shouldn’t bias any evidence.