
Conference Paper

Efficient Sample Reuse in EM-Based Policy Search

MPG Authors

Peters, J.
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Full texts (freely accessible)
No freely accessible full texts are available in PuRe.
Supplementary material (freely accessible)
No freely accessible supplementary materials are available.
Citation

Hachiya, H., Peters, J., & Sugiyama, M. (2009). Efficient Sample Reuse in EM-Based Policy Search. In W. Buntine, M. Grobelnik, D. Mladenic, & J. Shawe-Taylor (Eds.), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009 (pp. 469-484). Berlin, Germany: Springer.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-C307-1
Abstract
Direct policy search is a promising reinforcement learning framework, particularly for control in continuous, high-dimensional systems such as anthropomorphic robots. Because of its high flexibility, policy search often requires a large number of samples to obtain a stable policy update estimator. This is prohibitive when sampling is expensive. In this paper, we extend an EM-based policy search method so that previously collected samples can be efficiently reused. The usefulness of the proposed method, called Reward-weighted Regression with sample Reuse, is demonstrated through a robot learning experiment.