> students use AI as a personalized tutor but still do the assignments themselves.
So your first study actually concludes the opposite. It concluded that all AI users performed worse, but the effect was smaller for students who used AI as a tutor.
The second meta-analysis I don’t quite understand. I understand that they conclude that using an AI tutor shows significant improvement, but I don’t understand the methodology. I may be misunderstanding, but it seems to simply count the papers which show positive outcomes and reach its conclusion that way. I think that methodology is deeply flawed, as it will amplify whichever biases are present in the studies it uses. I also think the lack of control groups is a major issue. If we are comparing an AI tutor to nothing, of course the AI tutor is going to perform better. We need to compare it to traditional methods. And this is especially relevant in our discussion because junior developers usually have excellent access to senior developers (via peer review, pair programming, etc.), much better than students’ access to tutors for that matter.
So out of the meta-analysis I picked the paper with the strongest claim (trying to steel-man it) which is this one: https://online-journal.unja.ac.id/JIITUJ/article/view/34809/...
It claims the following in the abstract:
> The results indicated that students employing AI tutors shown significant improvements in problem-solving and personalized learning compared to the control group.
Now when I look at the control group it claims this (also in the abstract):
> Participants were allocated to a control group receiving conventional training and an experimental group utilizing AI technology,
But when I look into the methodology section I see this:
> The researchers classified the patients into two groups: MathGPT and Flexi 2.0
MathGPT and Flexi 2.0 are both AI tutors. Now I am confused: where is the control group, and how was this “conventional training” conducted?
The methodology section actually tells a different story from the abstract:
> This research utilized a quantitative methodology via a quasi-experimental design.
By quasi-experimental design they mean that they tested the same students before and after the AI intervention and concluded that the AI tutor helped them improve. That is not what a control group means, so the researchers are actually lying by omission in the abstract. This is a spectacularly bad experimental design, and I wondered how it passed peer review, so I looked at the publisher: Jurnal Ilmiah Ilmu Terapan Universitas Jambi. So not exactly a reputable journal.
I still stand by my claim that there is no evidence for a testable hypothesis. I suspect that your first link is actually correct in that AI is bad for students and just less bad if it is used as a tutor.
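To make the before/after point concrete, here is a toy sketch of why a pre/post comparison without a control group tells you very little. The scores are entirely made up for illustration (they are not from the paper), and the function names are my own. In this hypothetical, both groups improve by the same amount just from taking the course, so the before/after gain looks impressive while the controlled comparison shows no AI effect at all.

```python
# Toy illustration only: all scores below are invented, not data from the paper.

def mean(xs):
    return sum(xs) / len(xs)

def pre_post_gain(pre, post):
    """Average improvement of a single group between pre-test and post-test."""
    return mean(post) - mean(pre)

def controlled_effect(pre_treat, post_treat, pre_ctrl, post_ctrl):
    """Treated group's gain minus the control group's gain."""
    return pre_post_gain(pre_treat, post_treat) - pre_post_gain(pre_ctrl, post_ctrl)

# Hypothetical AI-tutor group.
ai_pre = [50, 60, 70]
ai_post = [60, 70, 80]

# Hypothetical conventionally taught group, which improves just as much.
conv_pre = [52, 58, 70]
conv_post = [62, 68, 80]

# Before/after only: looks like the AI tutor added 10 points.
print(pre_post_gain(ai_pre, ai_post))  # 10.0

# Compared against the control group: the apparent effect disappears.
print(controlled_effect(ai_pre, ai_post, conv_pre, conv_post))  # 0.0
```

Real data could of course go either way; the point is only that the before/after number alone cannot distinguish the two cases.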