Human-in-the-loop reinforcement learning

More specifically, fruit-in-the-loop reinforcement learning.