No benchmark will be perfect, especially if it's public, but it's a fun experiment for seeing how these models keep getting better.