  The role of prefrontal cortex and basal ganglia in model-based and model-free reinforcement learning

Miranda, B., Malalasekera, N., & Dayan, P. (2013). The role of prefrontal cortex and basal ganglia in model-based and model-free reinforcement learning. Poster presented at 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Princeton, NJ, USA.

Creators

Miranda, B., Author
Malalasekera, N., Author
Dayan, P. (1), Author
Affiliations:
(1) External Organizations

Content

Abstract: Animals can learn to influence their environment either by exploiting stimulus-response associations that have been productive in the past, or by predicting the likely worth of actions in the future based on their causal relationships with outcomes. These respectively model-free (MF) and model-based (MB) strategies are supported by structures including midbrain dopaminergic neurons, striatum and prefrontal cortex (PFC), but it is not clear how they interact to realize these two types of reinforcement learning (RL).
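To make the distinction concrete (this illustration is not part of the abstract), the two strategies differ in how first-stage action values are computed. A minimal sketch in Python, where the learning rate and transition matrix are hypothetical placeholders:

import numpy as np

# Model-free: nudge the chosen action's value toward the experienced
# reward; alpha is an illustrative learning rate, not a fitted value.
def mf_update(q_mf, action, reward, alpha=0.1):
    q_mf[action] += alpha * (reward - q_mf[action])
    return q_mf

# Model-based: evaluate first-stage actions through the known transition
# structure, Q_MB(a) = sum_s P(s | a) * max_a' Q2(s, a').
def mb_values(p_trans, q_stage2):
    # p_trans: 2x2 matrix of P(second-stage state | first-stage action)
    # q_stage2: 2x2 array of second-stage action values
    return p_trans @ q_stage2.max(axis=1)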
We trained rhesus monkeys to perform a two-stage Markov decision task that induces a combination of MB
and MF behavior. The task starts with a choice between two options. Each of these is more often associated
with one of two second-stage states with probabilities that are fixed throughout the experiment. A second
two-option choice is required in order to obtain one of three different levels of reward. These second-stage
outcomes change independently, according to a random walk, and thus induce exploration.
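A minimal simulation of this design might look like the following; the common-transition probability (0.7), the reward levels and the random-walk step size are assumptions for illustration, since the abstract does not report the exact values.

import numpy as np

class TwoStageTask:
    # Two first-stage options, each leading more often to one of two
    # second-stage states (fixed probabilities); second-stage options pay
    # one of three reward levels whose odds drift as independent random walks.
    def __init__(self, p_common=0.7, drift_sd=0.025, seed=None):
        self.rng = np.random.default_rng(seed)
        self.p_common = p_common  # fixed throughout the experiment
        self.drift_sd = drift_sd  # assumed random-walk step size
        self.p_high = self.rng.uniform(0.25, 0.75, size=(2, 2))

    def first_stage(self, action):
        # action 0 usually leads to state 0, action 1 to state 1
        common = self.rng.random() < self.p_common
        return action if common else 1 - action

    def second_stage(self, state, action):
        # draw one of three reward levels; p_high tilts the draw toward
        # the largest level, then drifts so that exploration stays useful
        p = self.p_high[state, action]
        probs = np.array([1.0 - p, 0.5, p])
        reward = self.rng.choice([0.0, 0.5, 1.0], p=probs / probs.sum())
        self.p_high += self.rng.normal(0.0, self.drift_sd, size=(2, 2))
        self.p_high = np.clip(self.p_high, 0.25, 0.75)
        return reward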
A descriptive analysis of our behavioral data shows that the immediate reward history (of MF and MB importance) and the interaction between reward history and the structure of the task (of MB importance) both significantly influenced stage-one choices. On the other hand, only the immediate reward history seemed to influence reaction time. When we performed a trial-by-trial computational analysis on our data using different RL algorithms, we found that in the model that best fit the data, choices were made according to a weighted combination of MF-RL and MB-RL action values (with a weight for MB-RL of 84.3 ± 3.2%).
Our behavioral findings support a more integrated view of MF and MB learning strategies. They also illuminate the way that the vigor of responding relates to the average rate of reward delivery. Neurophysiological recordings are currently being performed in subregions of PFC and the striatum during task performance.
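As a sketch of the best-fitting choice rule, the net stage-one values can be written as a convex combination of the two estimates; the MB weight below is the fitted mean reported above, while the softmax inverse temperature is a hypothetical parameter.

import numpy as np

def hybrid_policy(q_mf, q_mb, w_mb=0.843, beta=5.0):
    # Weighted mixture of model-based and model-free action values;
    # w_mb = 0.843 corresponds to the reported 84.3 +/- 3.2% MB weight.
    q_net = w_mb * q_mb + (1.0 - w_mb) * q_mf
    # Softmax over net values; beta is an illustrative assumption.
    z = beta * (q_net - q_net.max())
    p = np.exp(z)
    return p / p.sum()

For example, hybrid_policy(np.array([0.2, 0.4]), np.array([0.7, 0.3])) returns the probabilities of the two first-stage choices under the fitted weighting.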

Details

 Dates: 2013-10
 Publication Status: Published online

Event

Title: 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013)
Place of Event: Princeton, NJ, USA
Start-/End Date: 2013-10-25 - 2013-10-27

Source 1

Title: 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013)
Source Genre: Proceedings
Sequence Number: S30
Start / End Page: 61