# Item

Journal Article

#### Generalized Thompson sampling for sequential decision-making and causal inference

##### External Resource

http://www.casmodeling.com/content/pdf/2194-3206-2-2.pdf

(Publisher version)

##### Citation

Ortega, P., & Braun, D. (2014). Generalized Thompson sampling for sequential decision-making
and causal inference. *Complex Adaptive Systems Modeling*, *2*(2),
1-23. doi:10.1186/2194-3206-2-2.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0027-804C-8

##### Abstract

Purpose
Sampling an action according to the probability that the action is believed to be the optimal one is sometimes called Thompson sampling.
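
As a rough illustration of the idea, the following minimal sketch applies Thompson sampling to a Bernoulli bandit. It is not taken from the article: the function name `thompson_bernoulli`, the uniform Beta(1, 1) priors, and the simulated reward model are all assumptions made for the example. Drawing one sample per arm from its posterior and playing the arm with the largest sample selects each arm with exactly the probability that it is currently believed to be optimal.

```python
import random


def thompson_bernoulli(true_means, n_rounds, seed=0):
    """Illustrative Thompson sampling on a simulated Bernoulli bandit.

    true_means is the hidden success probability of each arm (assumed
    setup for this sketch). Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    wins = [0] * n_arms    # observed successes per arm
    losses = [0] * n_arms  # observed failures per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible mean per arm from its Beta posterior
        # (a uniform Beta(1, 1) prior is an assumption of this sketch).
        samples = [
            rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)
        ]
        # Playing the arm with the largest sample picks each arm with the
        # posterior probability that it is the optimal one.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate the environment's response.
        reward = 1 if rng.random() < true_means[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1

    return pulls
```

Calling `thompson_bernoulli([0.2, 0.8], 2000)` should concentrate most pulls on the second arm as the posterior sharpens.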
Methods
Although mostly applied to bandit problems, Thompson sampling can also be used to solve sequential adaptive control problems, when the optimal policy is known for each possible environment. The predictive distribution over actions can then be constructed by a Bayesian superposition of the policies weighted by their posterior probability of being optimal.
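
The superposition of policies described above can be sketched for a finite set of candidate environments. This is a hedged toy example, not the authors' implementation: `generalized_thompson`, the representation of each hypothesis as a list of Bernoulli arm means, and the choice of "pull the best arm" as each hypothesis's known optimal policy are all assumptions. Sampling a hypothesis from the posterior and then acting with its optimal policy is equivalent to sampling from the posterior-weighted mixture of policies.

```python
import random


def generalized_thompson(hypotheses, true_env, n_rounds, seed=0):
    """Toy sketch: mix known optimal policies by posterior weight.

    hypotheses: candidate environments, each a list of arm means strictly
                between 0 and 1 (hypothetical representation).
    true_env:   the arm means that actually generate rewards.
    Returns the final posterior over hypotheses.
    """
    rng = random.Random(seed)
    n_hyp = len(hypotheses)
    posterior = [1.0 / n_hyp] * n_hyp  # uniform prior (assumption)

    for _ in range(n_rounds):
        # Step 1: sample one candidate environment from the posterior.
        h = rng.choices(range(n_hyp), weights=posterior)[0]

        # Step 2: act with the policy that is optimal for that candidate
        # (here simply: pull its best arm).
        means = hypotheses[h]
        arm = max(range(len(means)), key=lambda a: means[a])

        # Step 3: observe a reward from the true environment.
        reward = 1 if rng.random() < true_env[arm] else 0

        # Step 4: Bayesian update using only the likelihood of the
        # observation given the chosen action; the agent's own action is
        # not treated as evidence about the environment.
        likelihoods = [m[arm] if reward else 1.0 - m[arm] for m in hypotheses]
        unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
        total = sum(unnormalized)
        posterior = [u / total for u in unnormalized]

    return posterior
```

With `hypotheses = [[0.8, 0.2], [0.2, 0.8]]` and `true_env = [0.2, 0.8]`, the posterior mass should move toward the second hypothesis, so the agent increasingly follows that hypothesis's policy.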
Results
Here we discuss two important features of this approach. First, we show to what extent such generalized Thompson sampling can be regarded as an optimal strategy under limited information-processing capabilities that constrain the sampling complexity of the decision-making process. Second, we show how such Thompson sampling can be extended to solve causal inference problems when interacting with an environment in a sequential fashion.
Conclusion
In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.