Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Grau-Moya, J; Leibfried, F; Genewein, T; Braun, DA

doi:10.1007/978-3-319-46227-1_30

Grau-Moya, J., Leibfried, F., Genewein, T., & Braun, D. (2016). Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes. In P. Frasconi, N. Landwehr, G. Manco, & J. Vreeken (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 475-491). Cham, Switzerland: Springer.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0000-7A78-1
Version Permalink: https://hdl.handle.net/21.11116/0000-0000-7A79-0

Genre: Conference Paper

Locators

Locator: Link (Any fulltext)
Description: -
OA-Status: Open Access status unknown

Creators

Creators:
Grau-Moya, J (1, 2), Author
Leibfried, F (2, 3), Author
Genewein, T (2, 3, 4), Author
Braun, DA (2, 3, 4), Author

Affiliations:
(1) Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1497647
(2) Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497809
(3) Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794
(4) Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society, ou_1688138

Content

Free keywords: -

Abstract: Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
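Illustrative sketch (not taken from the paper): this record does not reproduce the paper's equations, so the following Python code is only one plausible reading of the abstract. It assumes that both the KL-constrained choice over actions and the treatment of model uncertainty can be written as a log-sum-exp ("free energy") aggregation with its own temperature. All names (`soft_aggregate`, `free_energy_value_iteration`, `beta`, `alpha`, `rho`, `mu`) and the finite set of candidate transition models are assumptions made for illustration.

```python
import numpy as np

def soft_aggregate(values, weights, temp, axis):
    """Return (1/temp) * log sum_i weights_i * exp(temp * values_i) along `axis`.

    temp = 0 is treated as the limit case, a plain weighted mean.
    Large positive temp approaches the max, large negative temp the min.
    """
    if abs(temp) < 1e-8:
        return np.sum(weights * values, axis=axis)
    z = temp * values
    m = np.max(z, axis=axis, keepdims=True)  # shift for numerical stability
    out = m + np.log(np.sum(weights * np.exp(z - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis) / temp

def free_energy_value_iteration(T, R, rho, mu, beta, alpha,
                                gamma=0.9, tol=1e-8, max_iter=10_000):
    """Hypothetical generalized value iteration on a finite MDP.

    T:   (M, S, A, S) candidate transition models (assumed finite set)
    R:   (S, A, S)    rewards
    rho: (S, A)       reference policy (prior over actions)
    mu:  (M,)         belief over candidate models
    beta:  temperature of the action-selection constraint
    alpha: temperature of the model-uncertainty constraint
    """
    n_states = T.shape[1]
    F = np.zeros(n_states)
    for _ in range(max_iter):
        # Expected one-step return under each candidate model: shape (M, S, A).
        q_model = np.einsum("msan,san->msa", T, R + gamma * F)
        # Aggregate over models (alpha), then over actions (beta).
        q = soft_aggregate(q_model, mu[:, None, None], alpha, axis=0)
        F_new = soft_aggregate(q, rho, beta, axis=1)
        if np.max(np.abs(F_new - F)) < tol:
            return F_new
        F = F_new
    return F
```

Under these assumptions, a single known model with a large `beta` approximates standard value iteration, `alpha = 0` averages returns over the candidate models in the spirit of Bayesian planning, and a strongly negative `alpha` approaches a worst-case (robust) evaluation. These correspond to the limit cases named in the abstract, but the exact form of the paper's scheme should be taken from the paper itself.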

Details

Language(s):

Dates: Date issued: 2016-09

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: DOI: 10.1007/978-3-319-46227-1_30
BibTex Citekey: GrauMoyaLGB2016

Degree: -

Event

Title: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML PKDD 2016)

Place of Event: Riva del Garda, Italy

Start-/End Date: -

Source 1

Title: Machine Learning and Knowledge Discovery in Databases

Source Genre: Proceedings

Creator(s):
Frasconi, P., Editor
Landwehr, N., Editor
Manco, G., Editor
Vreeken, J., Editor

Affiliations:
-

Publ. Info: Cham, Switzerland : Springer

Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 475 - 491
Identifier: ISBN: 978-3-319-46226-4

Source 2

Title: Lecture Notes in Computer Science ; 9852

Source Genre: Series

Creator(s):

Affiliations:

Publ. Info: -

Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: -
Identifier: -