Is there value in using deep RL for problems that seem better suited to planning-based approaches?