  Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Li, W., Bohg, J., & Fritz, M. (2017). Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning. Retrieved from http://arxiv.org/abs/1711.00267.


Basic data

Genre: Research paper

Files

arXiv:1711.00267.pdf (Preprint), 445KB
Name: arXiv:1711.00267.pdf
Description: File downloaded from arXiv at 2018-02-01 09:45
OA status:
Visibility: Public
MIME type / checksum: application/pdf / [MD5]
Technical metadata:
Copyright date: -
Copyright info: -

Creators

Creators:
Li, Wenbin 1, Author
Bohg, Jeannette 2, Author
Fritz, Mario 1, Author
Affiliations:
1 Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society, ou_1116547
2 External Organizations, ou_persistent22

Content

Keywords: Computer Science, Robotics, cs.RO; Computer Science, Artificial Intelligence, cs.AI; Computer Science, Computer Vision and Pattern Recognition, cs.CV; Computer Science, Learning, cs.LG
Abstract: Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we bypass the need to explicitly model physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies parametrized by a goal. We validated the model on a toy example of navigating in a grid world with different target positions and on a block stacking task with different target structures for the final tower. In contrast to prior work, our policies show better generalization across different goals.
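
The central idea stated in the abstract is a policy parameterized by the goal, so a single learned network can handle a different target configuration on every trial. As a rough, purely illustrative sketch of that idea (not the authors' implementation; the layer sizes, state/goal encodings, and grid-world usage below are assumptions), a goal-conditioned Q-network in PyTorch could concatenate the goal to the state before predicting action values:

# Illustrative sketch only: a goal-conditioned Q-network in the spirit of the
# abstract's "policies parametrized by a goal". Layer sizes, encodings, and the
# grid-world example are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class GoalConditionedQNetwork(nn.Module):
    def __init__(self, state_dim: int, goal_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        # The goal is concatenated to the state, so the Q-values (and hence the
        # greedy policy) change when the target configuration changes.
        return self.net(torch.cat([state, goal], dim=-1))

# Hypothetical usage for a grid-world navigation toy task with 4 discrete actions.
q_net = GoalConditionedQNetwork(state_dim=2, goal_dim=2, num_actions=4)
state = torch.tensor([[1.0, 3.0]])  # assumed (x, y) agent position
goal = torch.tensor([[4.0, 0.0]])   # assumed (x, y) target position, varies per episode
action = q_net(state, goal).argmax(dim=-1)

Under these assumptions, training could proceed with any standard value-based deep RL update, sampling a new goal at the start of each episode so the network learns to generalize across targets.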

Details

Language(s): eng - English
Dates: 2017-11-01, 2017-11-22, 2017
Publication status: Published online
Pages: 10 p.
Place, publisher, edition: -
Table of contents: -
Type of review: -
Identifiers: arXiv: 1711.00267
URI: http://arxiv.org/abs/1711.00267
BibTeX citekey: Li1711.00267
Degree: -
