Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning

Publication
In NeurIPS 2019