Seems like the red team is incentivized to write tests that violate the spec, since you're rewarding failed tests.