The problem of various ML algorithms "gaming" the reward function, is rather similar to the problem of various financial and economic issues. If people are not trying to do something useful, and then expecting $$ in return for that, but rather are just trying to get $$ without knowing or caring what is productive, then you get a lot of non-productive stuff (spam, scams, pyramid schemes, high-frequency trading, etc.) that isn't actually producing anything, but does take over a larger and larger percentage of the economy.
To mitigate this, you have to have a system outside of that, which penalizes "gaming" the reward function. This system has to have some idea of what real value is, to be able to spot cases where the reward function is high but the value is low. We have a hard enough time of this in the money economy, where we've been learning for centuries. I do not think we are anywhere close in neural networks.
> This system has to have some idea of what real value is
This is probably the most cursed problem ever.
Assume you could develop such a system, why wouldn't you just incorporate its logic into the original fitness function and be done with it?
I think the answer is that such a system can probably never be developed. At some level humans must be involved in order to adapt the function over time in order to meet expectations as training progresses.
The information used to train on is beyond critical, but heuristics regarding what information matters more than other information in a given context might be even more important.
There is some relation to Goedel's theories here, about the inherent limitations of any system of logic to avoid both errors of omission and errors of commission. Either there are true things you cannot prove, or things you "prove" that are not true.
In any reward function, either there are valuable things that are not rewarded, or unvaluable things that are. But having multiple systems to evaluate this, does help.
Commenting to follow this.
There is a step like this in ML. I think it's pretty interesting that topics from things like economics pop up in ML - although perhaps it's not too surprising as we are doing ML for humans to use.
> Commenting to follow this.
You can “favorite” comments on HN to bookmark them.