  Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Grau-Moya, J., Leibfried, F., Genewein, T., & Braun, D. (2016). Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes. In P. Frasconi, N. Landwehr, G. Manco, & J. Vreeken (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 475-491). Cham, Switzerland: Springer.

Basic

Genre: Conference Paper

Locators

Locator:
Link (Any fulltext)
Description:
-
OA-Status:

Creators

Creators:
Grau-Moya, J. (1, 2), Author
Leibfried, F. (2, 3), Author
Genewein, T. (2, 3, 4), Author
Braun, D. A. (2, 3, 4), Author
Affiliations:
1: Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1497647
2: Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497809
3: Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794
4: Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1688138

Content

Free keywords: -
 Abstract: Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
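
The KL-constrained planning idea mentioned in the abstract can be illustrated with a minimal sketch. It covers only the known-model limit case, using the widely used free-energy backup V(s) = (1/alpha) log sum_a rho(a|s) exp(alpha * Q(s, a)), where rho is the reference policy and alpha trades off reward against the Kullback-Leibler cost. It is not the authors' generalized scheme: the model-uncertainty part is not reproduced here, and the function name, argument names, and array layout are assumptions chosen for illustration.

```python
import numpy as np


def free_energy_value_iteration(T, R, rho, alpha, gamma=0.9, tol=1e-8, max_iter=10_000):
    """KL-regularized value iteration for a known model (illustrative sketch).

    T:     (S, A, S) array, T[s, a, s2] = probability of s2 given state s, action a
    R:     (S, A, S) array of rewards
    rho:   (S, A) reference policy, each row sums to 1
    alpha: inverse temperature > 0 (large: near-greedy, small: stay close to rho)
    """
    V = np.zeros(T.shape[0])
    for _ in range(max_iter):
        # Expected one-step return for every state-action pair.
        Q = np.einsum("sap,sap->sa", T, R + gamma * V[None, None, :])
        # Free-energy backup, computed with the max subtracted for stability.
        m = Q.max(axis=1, keepdims=True)
        V_new = m[:, 0] + np.log(np.sum(rho * np.exp(alpha * (Q - m)), axis=1)) / alpha
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Resulting policy: reference policy reweighted by exponentiated Q-values.
    Q = np.einsum("sap,sap->sa", T, R + gamma * V[None, None, :])
    policy = rho * np.exp(alpha * (Q - Q.max(axis=1, keepdims=True)))
    policy /= policy.sum(axis=1, keepdims=True)
    return V, policy
```

In this sketch, a large alpha approaches standard value iteration (the backup tends to a max over actions), while a small alpha approaches policy evaluation of the reference policy rho.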

Details

Language(s):
 Dates: 2016-09
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1007/978-3-319-46227-1_30
BibTex Citekey: GrauMoyaLGB2016
 Degree: -

Event

Title: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2016)
Place of Event: Riva del Garda, Italy
Start-/End Date: -

Source 1

Title: Machine Learning and Knowledge Discovery in Databases
Source Genre: Proceedings
 Creator(s):
Frasconi, P., Editor
Landwehr, N., Editor
Manco, G., Editor
Vreeken, J., Editor
Affiliations:
-
Publ. Info: Cham, Switzerland : Springer
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 475 - 491
Identifier: ISBN: 978-3-319-46226-4

Source 2

Title: Lecture Notes in Computer Science ; 9852
Source Genre: Series
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: -
Identifier: -