TD(λ) converges with probability 1

Dayan, P., & Sejnowski, T. (1994). TD(λ) converges with probability 1. Machine Learning, 14(3), 295-301. doi:10.1007/BF00993978.

Creators

Dayan, P1, Author           
Sejnowski, TJ, Author
Affiliations:
1 External Organizations

Content

 Abstract: The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future.

Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions converge to their correct values, as large samples are taken, and Dayan (1992) extended his proof to the general case. This article proves the stronger result that the predictions of a slightly modified form of temporal difference learning converge with probability one, and shows how to quantify the rate of convergence.
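
As a rough illustration of the algorithm whose convergence is analysed, below is a minimal sketch of tabular TD(λ) prediction with accumulating eligibility traces, applied to the random-walk task of Sutton (1988). The function names, the constant step size, and the environment are illustrative assumptions rather than the paper's construction; in particular, the probability-one convergence proved in the article relies on appropriately decreasing step sizes, not the fixed alpha used here.

    import numpy as np

    def td_lambda(env_step, n_states, start_state, is_terminal,
                  alpha=0.1, gamma=1.0, lam=0.9, episodes=500):
        """Tabular TD(lambda) prediction with accumulating eligibility traces."""
        v = np.zeros(n_states)            # value estimates
        for _ in range(episodes):
            e = np.zeros(n_states)        # eligibility traces, reset each episode
            s = start_state
            while not is_terminal(s):
                s_next, r = env_step(s)   # one sample from the process generating the agent's future
                delta = r + gamma * v[s_next] - v[s]   # temporal-difference error
                e[s] += 1.0               # accumulate trace for the visited state
                v += alpha * delta * e    # stochastic-approximation update, weighted by traces
                e *= gamma * lam          # decay traces
                s = s_next
        return v

    # Example: the 5-state random walk of Sutton (1988); reward 1 on reaching the
    # right terminal state, 0 elsewhere.  True values of states 1..5 are 1/6..5/6.
    rng = np.random.default_rng(0)
    def random_walk_step(s):
        s_next = s + (1 if rng.random() < 0.5 else -1)
        return s_next, (1.0 if s_next == 6 else 0.0)

    values = td_lambda(random_walk_step, n_states=7, start_state=3,
                       is_terminal=lambda s: s in (0, 6))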

Details

 Dates: 1994-03
 Publication Status: Issued
 Identifiers: DOI: 10.1007/BF00993978

Source 1

Title: Machine Learning
Source Genre: Journal
Publ. Info: Dordrecht : Springer
Volume / Issue: 14 (3)
Start / End Page: 295 - 301
Identifier: ISSN: 0885-6125
CoNE: https://pure.mpg.de/cone/journals/resource/08856125