My reading: the difference is that this uses an agent as a judge rather than an LLM as a judge, paired with more structured judging parameters. Is that right? Is the agent just a loop over each criterion, or is it also reflecting on its own judging in some way?