I have no idea whether the evaluator itself is trustworthy, but it was supposedly independently evaluated by Appen: https://www.appen.com/whitepapers/benchmarking-subquadratics...