Better Optimism By Bayes: Adaptive Planning with Rich Models


Guez, A., Silver, D., & Dayan, P. (submitted). Better Optimism By Bayes: Adaptive Planning with Rich Models.

Cite as: http://hdl.handle.net/21.11116/0000-0004-BFA9-7
The computational costs of inference and planning have confined Bayesian model-based reinforcement learning to one of two dismal fates: powerful Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian non-parametric models but using simple, myopic planning strategies such as Thompson sampling. We ask whether it is feasible and truly beneficial to combine rich probabilistic models with a closer approximation to fully Bayesian planning. First, we use a collection of counterexamples to show formal problems with the over-optimism inherent in Thompson sampling. Then we leverage state-of-the-art techniques in efficient Bayes-adaptive planning and non-parametric Bayesian methods to perform qualitatively better than both existing conventional algorithms and Thompson sampling on two contextual bandit-like problems.
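
For readers unfamiliar with the myopic baseline named in the abstract, the following is a minimal sketch of Thompson sampling on a plain Bernoulli bandit with independent Beta(1, 1) priors. This setting, the function name `thompson_sampling`, and its parameters are assumptions chosen for illustration; they are not the models or experiments of the paper, which concern richer non-parametric models and contextual bandit-like problems. The sketch shows only the core loop: draw one model from the posterior, act greedily with respect to that draw, then update the posterior.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Thompson sampling on a Bernoulli bandit (illustrative setup only).

    true_probs: hypothetical success probability of each arm, unknown to the agent.
    Returns per-arm pull counts, success counts, and failure counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (Beta(1, 1) prior, so the parameters are counts plus one).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled model: this is the myopic
        # step, since the value of future information is not planned for.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.1, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
```

Because each action is chosen from a single posterior sample, the agent never reasons explicitly about how its beliefs will change after acting. Bayes-adaptive planning, by contrast, does take that into account.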