I think it will make the results way better and more representative of the models' abilities.
It would... but the test is inherently silly, so I'm still not sure if it's worth me investing that extra effort in it.