Optimism and Pessimism in Optimised Replay

Antonov, G; Gagne, C; Eldar, E; Dayan, P

doi:10.1101/2021.04.27.441454

Journal Article

MPS-Authors

Antonov, G
Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society

Gagne, C
Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society

Dayan, P
Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society

External Resource

https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1009634&type=printable
(Supplementary material)

Citation

Antonov, G., Gagne, C., Eldar, E., & Dayan, P. (2022). Optimism and Pessimism in Optimised Replay. PLoS Computational Biology, 18(1). doi:10.1101/2021.04.27.441454.

Cite as: https://hdl.handle.net/21.11116/0000-0008-6DD5-E

Abstract

The replay of task-relevant trajectories is known to contribute to memory consolidation and improved task performance. A wide variety of experimental data show that the content of replayed sequences is highly specific and can be modulated by reward as well as other prominent task variables. However, the rules governing the choice of sequences to be replayed still remain poorly understood. One recent theoretical suggestion is that the prioritization of replay experiences in decision-making problems is based on their effect on the choice of action. We show that this implies that subjects should replay sub-optimal actions that they dysfunctionally choose rather than optimal ones, when, by being forgetful, they experience large amounts of uncertainty in their internal models of the world. We use this to account for recent experimental data demonstrating exactly pessimal replay, fitting model parameters to the individual subjects’ choices.