Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Publication
In ICML 2020