Help Privacy Policy Disclaimer
  Advanced SearchBrowse





Forgetful Inference in a Sophisticated World Model

There are no MPG-Authors in the publication available
Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)
There are no public fulltexts stored in PuRe
Supplementary Material (public)
There is no public supplementary material available

Ahilan, S., Solomon, R., Conover, K., Niyogi, R., Shizgal, P., & Dayan, P. (2017). Forgetful Inference in a Sophisticated World Model. Poster presented at 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2017), Ann Arbor, MI, USA.

Cite as: https://hdl.handle.net/21.11116/0000-0004-DADE-D
Humans and other animals are able to discover underlying statistical structure in their environ-
ments and exploit it to achieve efficient and effective performance. However, the largest scale structures
such as ‘world models’ are often difficult to learn and use because they are obscure, involving long-range temporal dependencies. Here, we analyzed behavioral data from a lengthy experiment with rats, showing
that subjects discovered such hidden structure, using it to respond more quickly to rewarding states whilst
responding more slowly, or not at all, to unrewarding states. We also identified surprising occasions where
subjects responded rapidly to unrewarding states, despite the structure of the task seemingly having been
learned. We attributed these instances to immediate inferential imperfections caused by the partial observ-
ability of hidden states. To describe this process statistically, we built a hidden Markov model (HMM) of
the subjects’ models of the experiment, describing overall behavior as integrating recent observations with
the recollections of an imperfect memory. Over the course of training, we found that subjects came to
track their progress through the task more accurately, indicating an improved ability to infer state. Model
fits attributed this improvement to decreased forgetting of the previous state. This ‘learning to remember’
decreased reliance on more recent observations, which can be misleading, in favor of a more dependable