That's a different and much more boring type of cheating. The interesting part of the METR report is that the model itself is hacking the evaluation environment, not that some AI model provider is hardcoding answers to known evaluation questions (which wouldn't require the model to cheat or hack at all).