I think the solution is having a bunch of private, trusted benchmarks and averaging their announced results.

> averaging their announced results.

Obligatory XKCD: https://xkcd.com/937/