On the Hardness of Reinforcement Learning with Value-Function Approximation (and the Lack of Understanding Thereof)
Value-function approximation methods that operate in batch mode are of foundational importance to reinforcement learning (RL). Finite-sample guarantees for these methods, which provide the theoretical backbone of today's empirical (“deep”) RL, rely crucially on strong representation assumptions,…