  On the forking paths of exactitude

Bányai, M., & Dayan, P. (2022). On the forking paths of exactitude. Poster presented at 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2022), Providence, RI, USA.

Creators

show
hide
 Creators:
Bányai, M.¹, Author
Dayan, P.¹, Author

Affiliations:
¹ Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_3017468

Content

 Abstract: How inputs are represented is critical for performance in decision-making problems since it determines how superficial distinctions are discarded or parametrically suppressed. It is thus a central facet in RL, and also a focus of human and animal behavioural neuroscience. Superficiality depends on what a decision-maker currently knows and, most critically, what they expect to find out next - as an aggregation at one point in learning can affect potential disaggregations at later points. Thus, the optimal representation at any particular juncture is neither that which compactly summarises past observations nor that which supports the ultimately optimal policy. Here, we analyze this problem, showing that decision-makers need to plan in the space of possible future representations in the same careful way they balance exploration/exploitation of actions - for instance, via value estimation in a tree spanning possible future belief states and representations. In a contextual bandit in which states can optimally be aggregated into discrete abstractions, a representational trajectory corresponds to the temporal order by which finer or coarser-grained distinctions are made between different state space regions. We show how the optimal representational trajectory depends on the discount factor in addition to the belief state, predicting that the same series of observations should lead to different representational refinements at different discounting values. We show that representational coarse-graining is similarly beneficial for decision-makers who are only approximately Bayesian, using their representations to encode their beliefs about the reward structure of the environment. In sum, representational planning provides a general and flexible framework for modelling human statistical learning and decision making, for principled evaluation of heuristics, and for making predictions about both behavioural and neural signatures of learning.
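The abstract's claim that the optimal representational trajectory depends on the discount factor can be illustrated with a small toy calculation. This is a hypothetical sketch, not the authors' model: it simply assumes that keeping a coarse representation earns a moderate per-step reward immediately, while refining the representation costs k steps of lower reward before a higher asymptotic reward is reached.

```python
def value_coarse(r_coarse, gamma):
    """Infinite-horizon discounted return of keeping the coarse
    representation: a moderate per-step reward from the outset."""
    return r_coarse / (1.0 - gamma)

def value_fine(r_explore, r_fine, k, gamma):
    """Discounted return of refining the representation: k steps of
    lower reward r_explore while finer distinctions between state
    space regions are learned, then the higher reward r_fine forever."""
    learning_phase = r_explore * (1.0 - gamma**k) / (1.0 - gamma)
    refined_phase = gamma**k * r_fine / (1.0 - gamma)
    return learning_phase + refined_phase

# The same observations, different discounting: a myopic agent
# (gamma = 0.5) prefers to stay coarse, a far-sighted one
# (gamma = 0.99) prefers to pay the cost of refinement.
for gamma in (0.5, 0.99):
    coarse = value_coarse(0.8, gamma)
    fine = value_fine(0.5, 1.0, k=10, gamma=gamma)
    print(gamma, "coarse" if coarse > fine else "fine")
```

All numbers here (rewards 0.8, 0.5, 1.0 and the learning delay k = 10) are illustrative assumptions chosen so that the preferred representational trajectory flips with the discount factor.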

Details

Dates: 2022-05
Publication Status: Published online

Event

Title: 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2022)
Place of Event: Providence, RI, USA
Start / End Date: 2022-06-08 – 2022-06-11


Source 1

Title: 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2022)
Source Genre: Proceedings
Sequence Number: 1.22
Start / End Page: 23