Learning Control and Planning from the View of Control Theory and Imitation

Peters, J; Schaal, S

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Vortrag

Learning Control and Planning from the View of Control Theory and Imitation

MPG-Autoren

Es sind keine MPG-Autoren in der Publikation vorhanden

Externe Ressourcen

https://pure.mpg.de/pubman/faces/ViewItemFullPage.jsp?itemId=item_1792220_1
(Zusammenfassung)

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Peters, J., & Schaal, S. (2003). Learning Control and Planning from the View of Control Theory and Imitation. Talk presented at NIPS 2003 Workshop: Planning for the Real World: The promises and challenges of dealing with uncertainty. Whistler, BC, Canada.

Zitierlink: https://hdl.handle.net/11858/00-001M-0000-0013-DAB5-B

Zusammenfassung

Learning control and planning in high dimensional continuous state-action systems, e.g., as needed in a humanoid robot, has so far been a domain beyond the applicability of generic planning techniques like reinforcement learning and dynamic programming. This talk describes an approach we have taken in order to enable complex robotics systems to learn to accomplish control tasks. Adaptive learning controllers equipped with statistical learning techniques can be used to learn tracking controllers -- missing state information and uncertainty in the state estimates are usually addressed by observers or direct adaptive control methods. Imitation learning is used as an ingredient to seed initial control policies whose output is a desired trajectory suitable to accomplish the task at hand. Reinforcement learning with stochastic policy gradients using a natural gradient forms the third component that allows refining the initial control policy until the task is accomplished. In comparison to general learning control, this approach is highly prestructured and thus more domain specific. However, it seems to be a theoretically clean and feasible strategy for control systems of the complexity that we need to address.