A Survey of Preference-Based Reinforcement Learning Methods

TLDR

This survey provides a unified framework for preference-based reinforcement learning (PbRL) that formally describes the task and identifies the design principles that affect both the human's evaluation task and the computational complexity.