  Fitted Q-iteration by Advantage Weighted Regression

Neumann, G., & Peters, J. (2009). Fitted Q-iteration by Advantage Weighted Regression. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in neural information processing systems 21 (pp. 1177-1184). Red Hook, NY, USA: Curran.

Basic

Item Permalink: http://hdl.handle.net/11858/00-001M-0000-0013-C47D-F
Version Permalink: http://hdl.handle.net/21.11116/0000-0002-DE63-5
Genre: Conference Paper

Creators

 Creators:
Neumann, G, Author
Peters, J (1, 2), Author
Affiliations:
1. Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2. Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE, ou_1497794

Content

Free keywords: -
Abstract: Recently, fitted Q-iteration (FQI) based methods have become more popular due to their increased sample efficiency, a more stable learning process and the higher quality of the resulting policy. However, these methods remain hard to use for continuous action spaces, which frequently occur in real-world tasks, e.g., in robotics and other technical applications. The greedy action selection commonly used for the policy improvement step is particularly problematic, as it is expensive for continuous actions, can cause an unstable learning process, introduces an optimization bias and results in highly non-smooth policies unsuitable for real-world systems. In this paper, we show that by using a soft-greedy action selection, the policy improvement step used in FQI can be simplified to an inexpensive advantage-weighted regression. With this result, we are able to derive a new, computationally efficient FQI algorithm which can even deal with high-dimensional action spaces.
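
The idea of replacing a greedy maximization over actions with an advantage-weighted regression can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear policy mean, precomputed advantage estimates, exponential weights of the form exp(advantage / temperature), and a small ridge term for numerical stability. The function name and the parameters `temperature` and `ridge` are hypothetical and chosen only for illustration.

```python
import numpy as np


def advantage_weighted_regression(states, actions, advantages,
                                  temperature=1.0, ridge=1e-6):
    """Fit a linear policy mean a = [s, 1] @ theta by weighted least squares.

    Assumptions (not taken from the paper): linear policy, exponential
    weights exp(advantage / temperature), ridge regularization.
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Soft-greedy weighting: samples with a higher advantage count more.
    # Subtracting the maximum only rescales all weights by a common factor
    # and avoids overflow in exp.
    weights = np.exp((advantages - advantages.max()) / temperature)

    # Append a constant feature so the policy has a bias term.
    n = states.shape[0]
    features = np.hstack([states, np.ones((n, 1))])

    # Weighted normal equations: (X^T W X + ridge * I) theta = X^T W A.
    xtw = features.T * weights
    gram = xtw @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, xtw @ actions)


# Toy data: "good" actions follow a = 2 s, "bad" actions follow a = -2 s.
s = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
toy_states = np.vstack([s, s])
toy_actions = np.vstack([2.0 * s, -2.0 * s])
toy_advantages = np.concatenate([np.ones(50), -np.ones(50)])

theta_sharp = advantage_weighted_regression(
    toy_states, toy_actions, toy_advantages, temperature=0.1)
theta_flat = advantage_weighted_regression(
    toy_states, toy_actions, toy_advantages, temperature=1e6)
```

With a small temperature the fit follows the high-advantage actions almost exclusively (slope close to 2), while a very large temperature weights all samples nearly equally and the slope collapses toward 0. No maximization over the continuous action space is needed in either case, only a weighted least-squares solve.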

Details

Language(s):
 Dates: 2009-06
 Publication Status: Published in print
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: 5520
 Degree: -

Event

Title: Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008)
Place of Event: Vancouver, BC, Canada
Start-/End Date: 2008-12-08 - 2008-12-10

Source 1

Title: Advances in neural information processing systems 21
Source Genre: Proceedings
 Creator(s):
Koller, D, Editor
Schuurmans, D, Editor
Bengio, Y, Editor
Bottou, L, Editor
Affiliations:
-
Publ. Info: Red Hook, NY, USA : Curran
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 1177 - 1184
Identifier: ISBN: 978-1-60560-949-2