Abstract:
Humans cannot exactly solve most planning problems they face, but must approximate them.
Research has characterised one class of approximations, whereby experience accumulated in habits substitutes for the computational expense of searching goal-directed decision trees. Building on previous work in
which we characterised Pavlovian pruning of decision trees, we explore more efficient approximations to
the planning problem. We focus on habit-like caching ('memoization') of more complex action sequences
in dynamic, hierarchical decompositions of complex decision trees.
Three groups of subjects performed a planning task. They first learned a transition matrix and then learned
about rewards associated with the transitions. They then produced choice sequences of a given length from
a random starting state to maximise their total earnings. Using reinforcement learning models nested inside
a Chinese Restaurant Process, we infer subjects' hierarchical decomposition, stochastic memoization and
pruning strategies. Hierarchical decomposition and stochastic memoization models give detailed accounts
of complex features of choice data. We characterise how subjects dynamically establish subgoals; show that their
decomposition strategy achieves a near-optimal trade-off between computational costs and gains; that subjects
memoized and re-used complex choice sequences; that this re-use correlated negatively with their ability to
search the tree; and we replicate our previous finding that subjects disregard (prune) subtrees that lie
below large losses. We replicate all findings in two further datasets.
Humans employ multiple approximations when solving complex planning problems. They simplify the
problem by establishing subgoals and decomposing the decision tree around them in a manner that trades
computational costs for gains near-optimally. Rather than re-computing solutions on every trial, they re-use previous solutions.
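The stochastic-memoization idea described above can be illustrated with a minimal, CRP-flavoured cache: with probability proportional to how often a stored action sequence has been produced from a state, the agent re-emits it instead of re-searching the tree; otherwise (with probability governed by a concentration parameter alpha) it plans afresh. This is a hypothetical sketch for intuition only, not the paper's fitted model; all names and the `plan` function are assumptions.

```python
import random

def make_stochastic_memoizer(plan, alpha=1.0, rng=None):
    """CRP-style stochastic memoization.

    From a given state with n cached solutions, re-use a stored
    sequence with probability n / (n + alpha); otherwise call the
    costly planner (probability alpha / (n + alpha)) and cache its
    output. Larger alpha means more fresh planning.
    """
    rng = rng or random.Random(0)
    cache = {}  # state -> list of previously produced action sequences

    def act(state):
        seqs = cache.setdefault(state, [])
        n = len(seqs)
        if n and rng.random() < n / (n + alpha):
            return rng.choice(seqs)  # re-use: cheap, habit-like
        seq = plan(state)            # fresh, costly tree search
        seqs.append(seq)
        return seq

    return act
```

Setting `alpha=0` recovers pure memoization (plan once per state, then always re-use); a large `alpha` approaches full goal-directed search on every trial, so the parameter interpolates between the two regimes.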