Hachiya, H. Max Planck Society;
Peters, J. Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;
Hachiya, H., Peters, J., & Sugiyama, M. (2011). Reward-Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning. Neural Computation, 23(11), 2798-2832. doi:10.1162/NECO_a_00199.