  Reinforcement Learning with Simple Sequence Priors

Saanum, T., Éltetö, N., Dayan, P., Binz, M., & Schulz, E. (2024). Reinforcement Learning with Simple Sequence Priors. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, & S. Levine (Eds.), Advances in Neural Information Processing Systems 36: 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (pp. 61985-62005). Red Hook, NY, USA: Curran.


Basic

Genre: Conference Paper

Locators

Description:
-
OA-Status:
Not specified

Creators

Creators:
Saanum, T. [1], Author
Éltetö, N. [2], Author
Dayan, P. [2], Author
Binz, M. [1], Author
Schulz, E. [1], Author
Affiliations:
[1] Research Group Computational Principles of Intelligence, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_3189356
[2] Department of Computational Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_3017468

Content

Free keywords: -
Abstract: In reinforcement learning (RL), simplicity is typically quantified on an action-by-action basis, but this timescale ignores temporal regularities, like repetitions, often present in sequential strategies. We therefore propose an RL algorithm that learns to solve tasks with sequences of actions that are compressible. We explore two possible sources of simple action sequences: sequences that can be learned by autoregressive models, and sequences that are compressible with off-the-shelf data compression algorithms. Distilling these preferences into sequence priors, we derive a novel information-theoretic objective that incentivizes agents to learn policies that maximize rewards while conforming to these priors. We show that the resulting RL algorithm leads to faster learning and attains higher returns than state-of-the-art model-free approaches in a series of continuous control tasks from the DeepMind Control Suite. These priors also produce a powerful information-regularized agent that is robust to noisy observations and can perform open-loop control.
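
The idea that repetitive action sequences are "compressible with off-the-shelf data compression algorithms" can be illustrated with a minimal sketch. This is not the authors' algorithm or objective. It only compares the zlib-compressed size of two made-up sequences of discrete actions. The helper name `compressed_size`, the encoding of each action as one byte, and both example sequences are assumptions introduced here for illustration.

```python
import random
import zlib


def compressed_size(actions):
    """Size in bytes of a discrete action sequence (ints in 0..255) after zlib compression."""
    return len(zlib.compress(bytes(actions), 9))


# Hypothetical action sequences of equal length (256 steps each).
repetitive = [0, 1, 2, 3] * 64   # a short motif repeated over time
rng = random.Random(0)           # fixed seed so the example is reproducible
irregular = [rng.randrange(256) for _ in range(256)]

print("repetitive:", compressed_size(repetitive), "bytes")
print("irregular: ", compressed_size(irregular), "bytes")
```

The repetitive sequence compresses to far fewer bytes than the irregular one. That gap is the kind of sequence-level regularity an action-by-action measure of simplicity would not capture.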

Details

Language(s):
 Dates: 2024-05
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: -
 Degree: -

Event

Title: Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)
Place of Event: New Orleans, LA, USA
Start-/End Date: 2023-12-10 - 2023-12-16

Source 1

Title: Advances in Neural Information Processing Systems 36: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Source Genre: Proceedings
 Creator(s):
Oh, A, Editor
Naumann, T, Editor
Globerson, A, Editor
Saenko, K, Editor
Hardt, M, Editor
Levine, S, Editor
Affiliations:
-
Publ. Info: Red Hook, NY, USA : Curran
Pages: -
Volume / Issue: -
Sequence Number: 2710
Start / End Page: 61985 - 62005
Identifier: -