Reinforcing connectionism: learning the statistical way

Dayan, PS

Dayan, P. (1991). Reinforcing connectionism: learning the statistical way. PhD Thesis, University of Edinburgh, Edinburgh, UK.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0002-D8A5-0

Version Permalink: https://hdl.handle.net/21.11116/0000-0002-D8A6-F

Genre: Thesis

Files

Dayan1991.pdf (Publisher version), 9MB

File Permalink:
https://hdl.handle.net/21.11116/0000-0002-D8A7-E

Name:
Dayan1991.pdf

Description:
-

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
-

Creators

Creators:
Dayan, PS¹, Author

Affiliations:
¹ External Organizations, ou_persistent22

Content

Free keywords: -

Abstract: Connectionism's main contribution to cognitive science will prove to be the renewed impetus it has imparted to learning. Learning can be integrated into the existing theoretical foundations of the subject, and the combination, statistical computational theories, provides a framework within which many connectionist mathematical mechanisms naturally fit. Examples from supervised and reinforcement learning demonstrate this. Statistical computational theories already exist for certain associative matrix memories. This work is extended, allowing real-valued synapses and arbitrarily biased inputs. It shows that a covariance learning rule optimises the signal/noise ratio, a measure of the potential quality of the memory, and quantifies the performance penalty incurred by other rules. In particular, two that have been suggested as occurring naturally are shown to be asymptotically optimal in the limit of sparse coding. The mathematical model is justified in comparison with other treatments whose results differ. Reinforcement comparison is a way of hastening the learning of reinforcement learning systems in statistical environments. Previous theoretical analysis has not distinguished between different comparison terms, even though, empirically, a covariance rule has been shown to be better than just a constant one. The workings of reinforcement comparison are investigated by a second-order analysis of the expected statistical performance of learning, and an alternative rule is proposed and empirically justified. The existing proof that temporal difference prediction learning converges in the mean is extended from a special case involving adjacent time steps to the general case involving arbitrary ones. The interaction between the statistical mechanism of temporal difference and the linear representation is particularly stark. The performance of the method given a linearly dependent representation is also analysed.

Details

Language(s):

Dates: Date issued: 1991-02-25

Publication Status: Issued

Pages: -

Publishing info: Edinburgh, UK : University of Edinburgh

Table of Contents: -

Rev. Type: -

Identifiers: -

Degree: PhD
