Hierarchical deconstruction and memoization of goal-directed plans

Huys, Q., Lally, N., Faulkner, P., Gershman, S., Dayan, P., & Roiser, J. (2013). Hierarchical deconstruction and memoization of goal-directed plans. Poster presented at the 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013), Princeton, NJ, USA.

Creators

Creators:
Huys, Q, Author
Lally, N, Author
Faulkner, P, Author
Gershman, S, Author
Dayan, P1, Author
Roiser, J, Author
Affiliations:
1 External Organizations, ou_persistent22

Content

Free keywords: -
Abstract: Humans cannot exactly solve most planning problems they face, but must approximate them. Research has characterised one class of approximations, whereby experience accumulated in habits substitutes for the computational expense of searching goal-directed decision-trees. Building on previous work in which we characterised Pavlovian pruning of decision trees, we explore more efficient approximations to the planning problem. We focus on habit-like caching ('memoization') of more complex action sequences in dynamic, hierarchical decompositions of complex decision-trees.
Three groups of subjects performed a planning task. They first learned a transition matrix and then learned about the rewards associated with the transitions. They then produced choice sequences of a given length from a random starting state to maximise their total earnings. Using reinforcement learning models nested inside a Chinese Restaurant Process, we infer subjects' hierarchical decomposition, stochastic memoization and pruning strategies. Hierarchical decomposition and stochastic memoization models give detailed accounts of complex features of the choice data. We show that subjects dynamically establish subgoals; that their decomposition strategy achieves a near-optimal trade-off between computational costs and gains; that subjects memoize and re-use complex choice sequences; that this re-use correlates negatively with their ability to search the tree; and we replicate our previous finding that subjects disregard (prune) subtrees that lie below large losses. We replicate all findings in two further datasets.
Humans employ multiple approximations when solving complex planning problems. They simplify the problem by establishing subgoals and decomposing the decision-tree around these in a manner that trades computational costs for gains near optimally. They do not re-compute solutions on every trial but rather re-use previous solutions.
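
To make the two approximations concrete, the following is a minimal, hypothetical Python sketch of how Pavlovian pruning and CRP-based stochastic memoization might interact. The toy environment, the PRUNE_THRESHOLD and ALPHA parameters, and all function names are illustrative assumptions for exposition, not the authors' actual model.

import random

# Toy deterministic environment: transitions[s][a] gives the next state,
# rewards[s][a] the immediate reward. Both are illustrative assumptions.
N_STATES, ACTIONS = 3, (0, 1)
transitions = {s: {a: (s + a + 1) % N_STATES for a in ACTIONS}
               for s in range(N_STATES)}
rewards = {s: {a: random.choice([-70, -20, 20, 140]) for a in ACTIONS}
           for s in range(N_STATES)}

PRUNE_THRESHOLD = -50  # subtrees below losses this large are disregarded

def plan(state, depth):
    """Depth-limited tree search with Pavlovian pruning: branches that
    begin with a large loss are not expanded further."""
    if depth == 0:
        return 0.0, []
    best_value, best_seq = float("-inf"), []
    for a in ACTIONS:
        r = rewards[state][a]
        if r <= PRUNE_THRESHOLD:
            value, rest = r, []          # prune: do not search below the loss
        else:
            future, rest = plan(transitions[state][a], depth - 1)
            value = r + future
        if value > best_value:
            best_value, best_seq = value, [a] + rest
    return best_value, best_seq

# CRP-style stochastic memoization: a cached sequence is re-used with
# probability proportional to how often it has been used before; a fresh
# plan is computed with probability proportional to the concentration ALPHA.
cache = {}   # (state, depth) -> {tuple(action sequence): usage count}
ALPHA = 1.0

def choose_sequence(state, depth):
    table = cache.setdefault((state, depth), {})
    total = sum(table.values())
    if random.random() < total / (total + ALPHA):
        # re-use a memoized sequence, weighted by past usage
        seqs, counts = zip(*table.items())
        seq = random.choices(seqs, weights=counts)[0]
    else:
        _, seq = plan(state, depth)      # pay the cost of fresh tree search
        seq = tuple(seq)
    table[seq] = table.get(seq, 0) + 1
    return seq

for trial in range(5):
    print(choose_sequence(state=0, depth=3))

On each call, choose_sequence either re-uses a cached choice sequence (with probability proportional to its past usage) or pays the computational cost of a fresh depth-limited search, mirroring the trade-off between memoization and tree search described in the abstract.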

Details

Language(s): -
Dates: 2013-10
Publication Status: Published online
Pages: -
Publishing info: -
Table of Contents: -
Rev. Type: -
Identifiers: -
Degree: -

Event

Title: 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013)
Place of Event: Princeton, NJ, USA
Start-/End Date: 2013-10-25 - 2013-10-27


Source 1

Title: 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2013)
Source Genre: Proceedings
Creator(s): -
Affiliations: -
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: F55
Start / End Page: 40
Identifier: -