  Generalized Thompson sampling for sequential decision-making and causal inference

Ortega, P., & Braun, D. (2014). Generalized Thompson sampling for sequential decision-making and causal inference. Complex Adaptive Systems Modeling, 2(2), 1-23. doi:10.1186/2194-3206-2-2.

Basic

Item Permalink: http://hdl.handle.net/11858/00-001M-0000-0027-804C-8 Version Permalink: http://hdl.handle.net/21.11116/0000-0001-2ABF-A
Genre: Journal Article

Creators

Creators:
Ortega, PA, Author
Braun, DA (1, 2), Author
Affiliations:
1. Research Group Sensorimotor Learning and Decision-Making, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497809
2. Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE, ou_1497794

Content

Free keywords: -
Abstract:
Purpose: Sampling an action according to the probability that the action is believed to be the optimal one is sometimes called Thompson sampling.
Methods: Although mostly applied to bandit problems, Thompson sampling can also be used to solve sequential adaptive control problems when the optimal policy is known for each possible environment. The predictive distribution over actions can then be constructed by a Bayesian superposition of the policies, weighted by their posterior probability of being optimal.
Results: Here we discuss two important features of this approach. First, we show to what extent such generalized Thompson sampling can be regarded as an optimal strategy under limited information-processing capabilities that constrain the sampling complexity of the decision-making process. Second, we show how such Thompson sampling can be extended to solve causal inference problems when interacting with an environment in a sequential fashion.
Conclusion: In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.
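The sampling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-armed bandit, the two candidate environments, their reward probabilities, and all function names below are assumptions chosen for illustration. The agent keeps a posterior over candidate environments, samples one environment from it, acts with that environment's known optimal policy, and updates the posterior from the observed reward only, treating its own action as given.

```python
import random


def generalized_thompson_step(posterior, policies, rng):
    """Sample one hypothesis from the posterior and return its optimal action."""
    hypotheses = list(posterior)
    weights = [posterior[h] for h in hypotheses]
    sampled = rng.choices(hypotheses, weights=weights, k=1)[0]
    return policies[sampled]


def bayes_update(posterior, likelihoods):
    """Reweight each hypothesis by the observation likelihood, then normalize."""
    unnormalized = {h: posterior[h] * likelihoods[h] for h in posterior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def run_two_armed_bandit(steps=200, seed=0):
    rng = random.Random(seed)
    # Hypothetical environments: reward probability for arm 0 and arm 1.
    envs = {"arm0_better": (0.8, 0.2), "arm1_better": (0.2, 0.8)}
    # The optimal policy is assumed known for each candidate environment.
    policies = {"arm0_better": 0, "arm1_better": 1}
    true_env = envs["arm1_better"]
    posterior = {"arm0_better": 0.5, "arm1_better": 0.5}
    for _ in range(steps):
        action = generalized_thompson_step(posterior, policies, rng)
        reward = 1 if rng.random() < true_env[action] else 0
        # Only the reward is treated as evidence; the action is taken as given.
        likelihoods = {
            h: (p[action] if reward else 1.0 - p[action])
            for h, p in envs.items()
        }
        posterior = bayes_update(posterior, likelihoods)
    return posterior


if __name__ == "__main__":
    print(run_two_armed_bandit())
```

Averaged over many draws, the action chosen in each step follows the posterior-weighted mixture of the candidate policies, which is the "Bayesian superposition" the abstract refers to. In this toy setting the posterior mass concentrates on the true environment as rewards accumulate.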

Details

Language(s):
 Dates: 2014-03
 Publication Status: Published in print
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1186/2194-3206-2-2
BibTex Citekey: OrtegaB2014
 Degree: -

Source 1

Title: Complex Adaptive Systems Modeling
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: 2 (2)
Sequence Number: -
Start / End Page: 1 - 23
Identifier: -