  Reinforcement comparison

Dayan, P. (1991). Reinforcement comparison. In D. Touretzky, J. Elman, T. Sejnowski, & G. Hinton (Eds.), Connectionist Models: Proceedings of the 1990 Summer School (pp. 45-51). San Mateo, CA, USA: Morgan Kaufmann.

Basic

Genre: Conference Paper

Creators

 Creators:
Dayan, P [1], Author
Affiliations:
[1] External Organizations, ou_persistent22

Content

Free keywords: -
 Abstract: Sutton [in his PhD thesis] introduced a reinforcement comparison term into the equations governing certain stochastic learning automata, arguing that it should speed up learning, particularly for unbalanced reinforcement tasks. Williams's subsequent extensions [REINFORCE] to the class of algorithms demonstrated that they were all performing approximate stochastic gradient ascent, but that, in terms of expectations, the comparison term has no first-order effect. This paper analyses the second-order contribution, and uses the criterion that its modulus should be minimised to determine an optimal value for the comparison term. This value turns out to be different from the one Sutton used, and simulations hint at its efficacy.
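
The claim that the comparison term has no effect on the expected update can be checked on a toy case. The sketch below is not taken from the paper: it assumes a single Bernoulli unit with firing probability p = sigmoid(w), two hypothetical rewards r0 and r1, and the one-sample estimate (r - b) * (y - p), where b is the comparison term. It enumerates both actions to get the exact mean and variance of that estimate. Variance is used here only as a convenient stand-in for the spread of the update; the paper's own second-order criterion and its optimal value are not given in this record.

```python
import math


def sigmoid(w):
    """Logistic function mapping a real weight to a probability."""
    return 1.0 / (1.0 + math.exp(-w))


def estimator_moments(w, r0, r1, b):
    """Exact mean and variance of the one-sample estimate (r - b) * (y - p).

    w      : weight of the (hypothetical) Bernoulli unit, p = sigmoid(w)
    r0, r1 : rewards received for actions y = 0 and y = 1 (assumed values)
    b      : comparison term subtracted from the reward
    """
    p = sigmoid(w)
    # Enumerate both actions as (probability, reward, action) triples.
    outcomes = [(1.0 - p, r0, 0.0), (p, r1, 1.0)]
    mean = sum(q * (r - b) * (y - p) for q, r, y in outcomes)
    second = sum(q * ((r - b) * (y - p)) ** 2 for q, r, y in outcomes)
    return mean, second - mean ** 2


if __name__ == "__main__":
    w, r0, r1 = 1.0, 0.0, 1.0
    for b in (0.0, 0.25, 0.5, 0.75, 1.0):
        mean, var = estimator_moments(w, r0, r1, b)
        print(f"b={b:.2f}  mean={mean:.6f}  variance={var:.6f}")
```

Running the script prints one line per value of b. Under these assumptions the mean column stays the same for every b, because the expectation of (y - p) is zero, while the variance column changes with b.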

Details

Language(s):
 Dates: 1991
 Publication Status: Published in print
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: -
 Degree: -

Event

Title: 1990 Connectionist Models Summer School
Place of Event: San Diego, CA, USA
Start-/End Date: -

Source 1

Title: Connectionist Models: Proceedings of the 1990 Summer School
Source Genre: Proceedings
 Creator(s):
Touretzky, DS, Editor
Elman, JL, Editor
Sejnowski, TJ, Editor
Hinton, GE, Editor
Affiliations:
-
Publ. Info: San Mateo, CA, USA : Morgan Kaufmann
Pages: 404
Volume / Issue: -
Sequence Number: -
Start / End Page: 45 - 51
Identifier: ISBN: 1-55860-156-2