Why would it be a "waste of time"?

We are just getting into the nitty-gritty of LLM benchmarking - to be fair, they still have a long way to go IMO. But it's incredibly exciting that a locally run LLM is capable of producing results similar to those of a SOTA model.