> This system has to have some idea of what real value is
This is probably the most cursed problem ever.
Assume you could develop such a system: why wouldn't you just incorporate its logic into the original fitness function and be done with it?
I think the answer is that such a system can probably never be developed. At some level, humans must be involved to adapt the function over time so that it keeps meeting expectations as training progresses.
The information used to train on is beyond critical, but heuristics about which information matters most in a given context might be even more important.
There is some relation to Gödel's incompleteness theorems here, about the inherent inability of any sufficiently expressive system of logic to avoid both errors of omission and errors of commission. Either there are true things you cannot prove, or there are things you "prove" that are not true.
In any reward function, either there are valuable things that are not rewarded, or worthless things that are. But having multiple systems evaluate this does help.
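A rough sketch of what "multiple systems" could look like, purely as my own illustration: the function name `combine_scores`, the `disagreement_threshold` parameter, and the choice of taking the minimum are all assumptions, not an established method. The idea is that several imperfect evaluators each score an output in [0, 1], the reward is the most pessimistic score (so one fooled evaluator cannot reward something worthless on its own), and strong disagreement gets flagged for a human.

```python
def combine_scores(scores, disagreement_threshold=0.3):
    """Combine scores in [0, 1] from several independent evaluators.

    Returns (reward, needs_human_review). Hypothetical helper for
    illustration only; the threshold value is an arbitrary assumption.
    """
    if not scores:
        raise ValueError("need at least one evaluator score")
    # Conservative reward: every evaluator must agree the output is good.
    reward = min(scores)
    # A wide spread means the evaluators disagree, so ask a human.
    spread = max(scores) - min(scores)
    return reward, spread > disagreement_threshold


# Evaluators roughly agree: reward the output, no review needed.
print(combine_scores([0.9, 0.8, 0.85]))  # (0.8, False)
# One evaluator loves it, another hates it: low reward, flag for review.
print(combine_scores([0.9, 0.1]))  # (0.1, True)
```

Note the trade-off is still there: taking the minimum cuts down on rewarding worthless things, but it makes it more likely that something valuable goes unrewarded. It shifts the errors around rather than eliminating them.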