Reinforcement Learning of Motor Skills with Policy Gradients

Peters, J; Schaal, S

doi:10.1016/j.neunet.2008.02.003

Peters, J., & Schaal, S. (2008). Reinforcement Learning of Motor Skills with Policy Gradients. Neural networks, 21(4), 682-697. doi:10.1016/j.neunet.2008.02.003.

Item status: Released

Basic data

Record permalink: https://hdl.handle.net/11858/00-001M-0000-0013-C96D-9
Version permalink: https://hdl.handle.net/21.11116/0000-0003-309A-9

Genre: Journal article

External references

External reference:
https://www.sciencedirect.com/science/article/pii/S0893608008000701 (publisher's version), open access status unknown

Description:
-

OA status:

Creators

Creators:
Peters, J^{1, 2}, Author
Schaal, S, Author

Affiliations:
1Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794

Content

Keywords: -

Abstract: Autonomous learning is one of the hallmarks of human and animal behavior, and understanding the principles of learning will be crucial in order to achieve true autonomy in advanced machines like humanoid robots. In this paper, we examine learning of complex motor skills with human-like limbs. While supervised learning can offer useful tools for bootstrapping behavior, e.g., by learning from demonstration, it is only reinforcement learning that offers a general approach to the final trial-and-error improvement that is needed by each individual acquiring a skill. Neither neurobiological nor machine learning studies have, so far, offered compelling results on how reinforcement learning can be scaled to the high-dimensional continuous state and action spaces of humans or humanoids. Here, we combine two recent research developments on learning motor control in order to achieve this scaling. First, we interpret the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning. Second, we combine motor primitives with the theory of stochastic policy gradient learning, which currently seems to be the only feasible framework for reinforcement learning for humanoids. We evaluate different policy gradient methods with a focus on their applicability to parameterized motor primitives. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

Details

Language(s):

Date: Published: 2008-05

Publication status: Published

Pages: -

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: DOI: 10.1016/j.neunet.2008.02.003
BibTex Citekey: 4867

Degree type: -

Source 1

Title: Neural networks

Source genre: Journal

Creators:

Affiliations:

Place, publisher, edition: New York : Pergamon

Pages: -
Volume / Issue: 21 (4)
Article number: -
Start / end page: 682 - 697
Identifier: ISSN: 0893-6080
CoNE: https://pure.mpg.de/cone/journals/resource/954925558496
