Peters, J.: External Organizations; Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society
https://doi.org/10.48550/arXiv.2001.02435 (Preprint)
https://proceedings.mlr.press/v108/tosatto20a.html (Publisher version)
Tosatto, S., Carvalho, J., Abdulsamad, H., & Peters, J. (2020). A Nonparametric Off-Policy Policy Gradient. In S. Chiappa & R. Calandra (Eds.), Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) 2020 (pp. 167-177). PMLR. Retrieved from https://proceedings.mlr.press/v108/tosatto20a.html