Conference Paper

Hierarchical Relative Entropy Policy Search


Daniel, C.
Neumann, G.
Peters, J.
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;


Daniel, C., Neumann, G., & Peters, J. (2012). Hierarchical Relative Entropy Policy Search. In N. Lawrence, & M. Girolami (Eds.), Artificial Intelligence and Statistics, 21-23 April 2012, La Palma, Canary Islands (pp. 273-281). Madison, WI, USA: International Machine Learning Society.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-B7E8-8
Many real-world problems are inherently hierarchically structured. The use of this structure in an agent's policy may well be the key to improved scalability and higher performance. However, such hierarchical structures cannot be exploited by current policy search algorithms. We will concentrate on a basic, but highly relevant hierarchy: the 'mixed option' policy. Here, a gating network first decides which of the options to execute and, subsequently, the option-policy determines the action.

In this paper, we reformulate learning a hierarchical policy as a latent variable estimation problem and subsequently extend Relative Entropy Policy Search (REPS) to the latent variable case. We show that our Hierarchical REPS can learn versatile solutions while also showing increased performance in terms of learning speed and quality of the found policy in comparison to the non-hierarchical approach.
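The 'mixed option' structure described in the abstract can be sketched in a few lines. Below is a minimal illustrative model, not the paper's implementation: the gating network is assumed to be a linear softmax over options, and each option's sub-policy a linear-Gaussian controller. All class and parameter names (`MixedOptionPolicy`, `W_gate`, `K`, `sigma`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class MixedOptionPolicy:
    """Sketch of a 'mixed option' policy: a gating network samples an
    option o from pi(o|s), then that option's sub-policy samples the
    action from pi(a|s, o). Shapes and parameterization are assumptions
    made for this example only."""

    def __init__(self, n_options, state_dim, action_dim):
        # Gating parameters: one linear score per option (softmax gating).
        self.W_gate = rng.normal(scale=0.1, size=(n_options, state_dim))
        # Each option: linear-Gaussian sub-policy a ~ N(K[o] @ s, sigma^2 I).
        self.K = rng.normal(scale=0.1, size=(n_options, action_dim, state_dim))
        self.sigma = 0.5

    def gate_probs(self, s):
        # Softmax over option scores, shifted for numerical stability.
        scores = self.W_gate @ s
        scores = scores - scores.max()
        p = np.exp(scores)
        return p / p.sum()

    def act(self, s):
        # Step 1: the gating network decides which option to execute.
        p = self.gate_probs(s)
        o = rng.choice(len(p), p=p)
        # Step 2: the chosen option-policy determines the action.
        mean = self.K[o] @ s
        a = mean + self.sigma * rng.normal(size=mean.shape)
        return o, a

policy = MixedOptionPolicy(n_options=3, state_dim=4, action_dim=2)
s = rng.normal(size=4)
o, a = policy.act(s)
```

Treating the executed option o as a latent variable in this factorization, pi(a|s) = sum_o pi(o|s) pi(a|s, o), is what lets the paper cast hierarchical policy learning as latent variable estimation.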