Approximate Planning from Better Bounds on Q


Sezener, C., & Dayan, P. (2017). Approximate Planning from Better Bounds on Q. Poster presented at 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2017), Ann Arbor, MI, USA.

Cite as: https://hdl.handle.net/21.11116/0000-0004-DAE0-9

Planning problems are often solved approximately using simulation-based methods such as Monte Carlo Tree Search (MCTS). Indeed, UCT, perhaps the most popular MCTS algorithm, lies at the heart of many successful applications. However, UCT is fundamentally inefficient as a planning algorithm, since it is not focused exclusively on the value of the action that is ultimately chosen. Accordingly, even a modification to UCT as simple as accounting for myopic information values at the root of the search tree can yield significant performance improvements. Here, we propose a method that extends value-of-information-like computations to arbitrarily many nodes of the search tree for simple acyclic MDPs. We demonstrate significant performance improvements over other planning algorithms.
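
For readers unfamiliar with UCT, the following is a minimal sketch of the standard UCB1 selection rule that UCT applies at each node of the search tree. It is not the method proposed in the poster; the function name `ucb1_select`, its parameters, and the default exploration constant are assumptions chosen for illustration. The rule trades off an action's mean return against how rarely it has been tried, which is why simulations are spread according to cumulative reward rather than directed only at the final choice of action.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[a] -- number of times action a has been simulated (hypothetical input)
    values[a] -- mean return observed so far for action a (hypothetical input)
    c         -- exploration constant; sqrt(2) is a common default, assumed here
    """
    # Try every untried action once before the formula applies.
    for a, n in enumerate(counts):
        if n == 0:
            return a

    total = sum(counts)
    # Mean return plus an exploration bonus that shrinks as counts[a] grows.
    scores = [
        values[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the bonus terms cancel and the action with the higher mean is selected; with very unequal counts, a rarely tried action can be selected even when its mean is lower.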