Meeting Abstract

Trials-with-fewer-errors: Feature-based learning and exploration

Citation

Stojic, H., Analytis, P., Dayan, P., & Speekenbrink, M. (2017). Trials-with-fewer-errors: Feature-based learning and exploration. In MathPsych 2017 ICCM (p. 129).


Cite as: https://hdl.handle.net/21.11116/0000-0004-7E55-0
Abstract
Reinforcement learning algorithms have provided useful insights into human and animal learning and decision making. However, they perform poorly when faced with real-world cases in which the quality of options is signalled by multiple potential features. We propose an approximate Bayesian optimization framework for tackling such problems. The framework relies on similarity-based learning of functional relationships between features and rewards, and choice rules that use uncertainty in balancing the exploration-exploitation trade-off. We can expect decision makers who learn functional relationships – function learners – to exhibit various characteristic behaviours. First, they will quickly come to avoid exploring options for which the reward function predicts low rewards. Second, if their priors do not correspond to the current environment, then function learners will be led astray by feature information. Third, function learners will explore options to enhance their functional knowledge, that is, they will incorporate the uncertainty associated with the impact of features into their choices. We tested our framework using a series of novel multi-armed bandit experiments (N=1068) in which rewards were noisy functions of two observable features. We compared human behaviour in these problems to solutions provided by Bayesian models. As a whole, participants did not perform as well as optimal Bayesian inference; indeed, some ignored the feature information and relied on reward information only. However, others showed various signatures of Bayesian optimization, including being guided by prior expectations about reward functions, taking uncertainty into account when choosing between options, and updating expectations appropriately in light of experience.