English
 
User Manual Privacy Policy Disclaimer Contact us
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Conference Paper

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

MPS-Authors
/persons/resource/persons84522

Grau-Moya,  J
Dept. Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons192683

Leibfried,  F
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons84447

Genewein,  T
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society;

/persons/resource/persons83827

Braun,  DA
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Sensorimotor Learning and Decision-making, Max Planck Institute for Intelligent Systems, Max Planck Society;

Locator

Link
(Any fulltext)

Fulltext (public)
There are no public fulltexts available
Supplementary Material (public)
There is no public supplementary material available
Citation

Grau-Moya, J., Leibfried, F., Genewein, T., & Braun, D. (2016). Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes. In P. Frasconi, N. Landwehr, G. Manco, & J. Vreeken (Eds.), Machine Learning and Knowledge Discovery in Databases (pp. 475-491). Cham, Switzerland: Springer.


Cite as: http://hdl.handle.net/21.11116/0000-0000-7A78-1
Abstract
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.