Released

Poster

Approximate Planning from Better Bounds on Q

Citation

Sezener, C., & Dayan, P. (2017). Approximate Planning from Better Bounds on Q. Poster presented at 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2017), Ann Arbor, MI, USA.


Cite as: https://hdl.handle.net/21.11116/0000-0004-DAE0-9
Abstract
Planning problems are often solved approximately using simulation-based methods such as Monte Carlo Tree Search (MCTS). Indeed, UCT, perhaps the most popular MCTS algorithm, lies at the heart of many successful applications. However, UCT is fundamentally inefficient as a planning algorithm, since it is not focused exclusively on the value of the action that is ultimately chosen. Accordingly, even as simple a modification to UCT as accounting for myopic information values at the root of the search tree can result in significant performance improvements. Here, we propose a method that extends value-of-information-like computations to arbitrarily many nodes of the search tree for simple acyclic MDPs. We demonstrate significant performance improvements over other planning algorithms.