Abstract:
Animals face uncertainty about their environments due to initial ignorance or subsequent change. They therefore need to explore. However, the algorithmic structure of exploratory choices in the brain remains largely elusive. Artificial agents face the same problem, and a venerable idea in reinforcement learning is that they can plan appropriate exploratory choices offline, during the equivalent of quiet wakefulness or sleep. Although offline processing in humans and other animals, in the form of hippocampal replay and preplay, has recently been the subject of highly informative modelling, existing methods apply only to known environments. Thus, they cannot predict exploratory replay choices during learning or behaviour in the face of uncertainty. Here, we extend an influential theory of hippocampal replay and examine its potential role in approximately optimal exploration, deriving testable predictions for the patterns of exploratory replay choices in a paradigmatic spatial navigation task. Our modelling provides a normative interpretation of the available experimental data suggestive of exploratory replay. Furthermore, we highlight the importance of sequence replay and license a range of new experimental paradigms that should further our understanding of offline processing.