Next state prediction gives rise to entangled, yet compositional representations of objects

Saanum, T; Schulze Buschoff, LM; Dayan, P; Schulz, E

doi:10.48550/arXiv.2410.04940


Released

Preprint


MPG Authors


Saanum, T
Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society;


Dayan, P
Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society;

External Resources

https://arxiv.org/pdf/2410.04940
(any full text)


Citation

Saanum, T., Schulze Buschoff, L., Dayan, P., & Schulz, E. (submitted). Next state prediction gives rise to entangled, yet compositional representations of objects.

Citation link: https://hdl.handle.net/21.11116/0000-0010-0317-1

Abstract

Compositional representations are thought to enable humans to generalize across combinatorially vast state spaces. Models with learnable object slots, which encode information about objects in separate latent codes, have shown promise for this type of generalization but rely on strong architectural priors. Models with distributed representations, on the other hand, use overlapping, potentially entangled neural codes, and their ability to support compositional generalization remains underexplored. In this paper we examine whether distributed models can develop linearly separable representations of objects, like slotted models, through unsupervised training on videos of object interactions. We show that, surprisingly, models with distributed representations often match or outperform models with object slots in downstream prediction tasks. Furthermore, we find that linearly separable object representations can emerge without object-centric priors, with auxiliary objectives like next-state prediction playing a key role. Finally, we observe that distributed models' object representations are never fully disentangled, even if they are linearly separable: Multiple objects can be encoded through partially overlapping neural populations while still being highly separable with a linear classifier. We hypothesize that maintaining partially shared codes enables distributed models to better compress object dynamics, potentially enhancing generalization.
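
The distinction the abstract draws between entangled and linearly separable codes can be illustrated with a toy sketch. It is not the paper's model, data, or evaluation: the object states are synthetic, the "encoder" is assumed to be a fixed random linear mixing, and the probe is a least-squares linear readout rather than a trained classifier. All names (`states`, `mixing`, `readout`, `shared`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_objects, obj_dim, latent_dim = 2000, 2, 2, 16

# Assumption: synthetic object states (e.g. 2-D positions of two objects).
states = rng.normal(size=(n_samples, n_objects, obj_dim))
targets = states.reshape(n_samples, -1)

# Assumption: a dense random linear mixing stands in for a distributed encoder,
# so each latent unit receives input from both objects (an entangled code).
mixing = rng.normal(size=(n_objects * obj_dim, latent_dim))
latents = targets @ mixing

# Linear probe: least-squares readout of all object states from the shared latent.
train, test = slice(0, 1500), slice(1500, None)
readout, *_ = np.linalg.lstsq(latents[train], targets[train], rcond=None)
pred = latents[test] @ readout

# Held-out R^2 per state dimension measures linear decodability.
ss_res = ((targets[test] - pred) ** 2).sum(axis=0)
ss_tot = ((targets[test] - targets[test].mean(axis=0)) ** 2).sum(axis=0)
r2 = 1.0 - ss_res / ss_tot

# Fraction of latent units that carry non-negligible weight from both objects.
weight_per_object = np.abs(mixing).reshape(n_objects, obj_dim, latent_dim).sum(axis=1)
shared = np.mean((weight_per_object > 0.1).all(axis=0))

print("min held-out R^2:", r2.min())
print("fraction of units shared by both objects:", shared)
```

In this toy setting most units respond to both objects, yet a linear readout recovers each object's state almost exactly. Overlapping codes and linear separability are therefore compatible in principle; whether and how this arises in trained video models is the question the paper addresses.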