Trials-with-fewer-errors: Feature-based learning and exploration

Stojic, H., Analytis, P., Dayan, P., & Speekenbrink, M. (2017). Trials-with-fewer-errors: Feature-based learning and exploration. In MathPsych 2017 ICCM (p. 129).

Basic
Genre: Meeting Abstract


Creators

Creators:
Stojic, H., Author
Analytis, P. P., Author
Dayan, P.¹, Author
Speekenbrink, M., Author
Affiliations:
¹ External Organizations

Content

Free keywords: -
Abstract: Reinforcement learning algorithms have provided useful insights into human and animal learning and decision making. However, they perform poorly when faced with real-world cases in which the quality of options is signalled by multiple potential features. We propose an approximate Bayesian optimization framework for tackling such problems. The framework relies on similarity-based learning of functional relationships between features and rewards, and on choice rules that use uncertainty in balancing the exploration-exploitation trade-off. We can expect decision makers who learn functional relationships – function learners – to exhibit various characteristic behaviours. First, they will quickly come to avoid exploring options for which the reward function predicts low rewards. Second, if their priors do not correspond to the current environment, then function learners will be led astray by feature information. Third, function learners will explore options to enhance their functional knowledge, i.e., including the uncertainty associated with the impact of features in making their choices. We tested our framework using a series of novel multi-armed bandit experiments (N = 1068) in which rewards were noisy functions of two observable features. We compared human behaviour in these problems to solutions provided by Bayesian models. The participants as a whole did not perform as well as optimal Bayesian inference; indeed, some ignored the feature information and relied on reward information only. However, others showed various signatures of Bayesian optimization, including being guided by prior expectations about reward functions, taking uncertainty into account when choosing between options, and updating expectations appropriately in light of experiences.
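The similarity-based function learning and uncertainty-guided choice described in the abstract can be read as Gaussian process regression paired with an upper-confidence-bound (UCB) rule. A minimal sketch under that interpretation; the RBF kernel, noise level, and exploration weight `beta` are illustrative assumptions, not parameters reported in the abstract:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel: similarity between feature vectors."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def gp_posterior(X_obs, y_obs, X_arms, noise=0.1, lengthscale=1.0):
    """Posterior mean and sd of the reward function at each arm's features."""
    K = rbf_kernel(X_obs, X_obs, lengthscale) + noise ** 2 * np.eye(len(X_obs))
    K_star = rbf_kernel(X_arms, X_obs, lengthscale)
    K_inv = np.linalg.inv(K)
    mu = K_star @ K_inv @ y_obs
    # Prior variance at each arm is 1 under the RBF kernel; subtract the
    # variance explained by the observations.
    var = 1.0 - np.einsum('ij,jk,ik->i', K_star, K_inv, K_star)
    return mu, np.sqrt(np.maximum(var, 0.0))

def choose_arm(mu, sd, beta=2.0):
    """UCB choice rule: posterior uncertainty adds an exploration bonus."""
    return int(np.argmax(mu + beta * sd))
```

A learner like this shows the signatures listed above: arms whose features predict low rewards get low `mu` and are soon avoided, while arms with high `sd` keep attracting exploratory choices until their functional relationship is resolved.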

Details

Language(s):
 Dates: 2017-07
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: -
 Degree: -

Event

Title: 50th Annual Meeting of the Society for Mathematical Psychology, the European Mathematical Psychology Group, 15th Annual Meeting of the International Conference on Cognitive Modelling (MathPsych/ICCM 2017)
Place of Event: Warwick, UK
Start-/End Date: 2017-07-22 - 2017-07-25


Source 1

Title: MathPsych 2017 ICCM
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 129
Identifier: -