Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark

Riedmiller, M; Peters, J; Schaal, S

doi:10.1109/ADPRL.2007.368196

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Bitte beachten Sie, dass eine neuere Version dieses Datensatzes verfügbar ist:
https://pure.mpg.de/pubman/item/item_1790533_2

DetailsÜbersicht

Freigegeben

Konferenzbeitrag

Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark

MPG-Autoren

/persons/resource/persons84135

Peters, J
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

Externe Ressourcen

Es sind keine externen Ressourcen hinterlegt

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Riedmiller, M., Peters, J., & Schaal, S. (2007). Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark. Proceedings of the 2007 IEEE Internatinal Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007), 254-261.

Zitierlink: https://hdl.handle.net/11858/00-001M-0000-0013-CE1B-8

Zusammenfassung

In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, ‘vanilla‘ policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.