Vigor in the face of fluctuating rates of reward: An experimental examination

Guitart-Masip, M; Beierholm, UR; Dolan, RJ; Düzel, E; Dayan, P

doi:10.1162/jocn_a_00090

Journal Article

External Resource

https://www.mitpressjournals.org/doi/pdf/10.1162/jocn_a_00090
(Publisher version)

Citation

Guitart-Masip, M., Beierholm, U., Dolan, R., Düzel, E., & Dayan, P. (2011). Vigor in the face of fluctuating rates of reward: An experimental examination. Journal of Cognitive Neuroscience, 23(12), 3933-3938. doi:10.1162/jocn_a_00090.

Cite as: https://hdl.handle.net/21.11116/0000-0002-C7DB-7

Abstract

Two fundamental questions underlie the expression of behavior, namely what to do and how vigorously to do it. The former is the topic of an overwhelming wealth of theoretical and empirical work, particularly in the fields of reinforcement learning and decision-making, with various forms of affective prediction error playing key roles. Although vigor concerns motivation, and so is the subject of many empirical studies in diverse fields, it has suffered a dearth of computational models. Recently, Niv et al. [Niv, Y., Daw, N. D., Joel, D., & Dayan, P. Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology (Berlin), 191, 507–520, 2007] suggested that vigor should be controlled by the opportunity cost of time, which is itself determined by the average rate of reward. This coupling of reward rate and vigor can be shown to be optimal under the theory of average return reinforcement learning for a particular class of tasks but may also be a more general, perhaps hard-wired, characteristic of the architecture of control. We therefore tested the hypothesis that healthy human participants would adjust their RTs on the basis of the average rate of reward. We measured RTs in an odd-ball discrimination task for rewards whose magnitudes varied slowly but systematically. Linear regression on the subjects' individual RTs, using the time-varying average rate of reward as the regressor of interest and including nuisance regressors such as the immediate reward in a round and in the preceding round, showed that a significant fraction of the variance in subjects' RTs could indeed be explained by the rate of experienced reward. This validates one of the key proposals associated with the model, illuminating an apparently mandatory form of coupling that may involve tonic levels of dopamine.
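
The regression described in the abstract can be illustrated with a minimal sketch on simulated data. Everything specific here is an assumption made for illustration and is not taken from the paper: the exponentially weighted running average used as the reward-rate estimator, the learning rate `alpha`, the function names, and the simulated reward schedule and RT model. The sketch only shows the general shape of the analysis: RTs regressed on a time-varying average reward rate, with the current and preceding rewards as nuisance regressors.

```python
import numpy as np


def average_reward_rate(rewards, alpha=0.1):
    """Running average of past rewards (assumed estimator, not the paper's).

    rate[t] uses only rewards from trials before t, so it is the estimate
    available to the participant when responding on trial t.
    """
    rewards = np.asarray(rewards, dtype=float)
    rate = np.zeros(len(rewards))
    running = 0.0
    for t, r in enumerate(rewards):
        rate[t] = running
        running += alpha * (r - running)  # exponential update toward r
    return rate


def fit_rt_regression(rts, rewards, alpha=0.1):
    """Ordinary least squares of RT on reward rate plus nuisance regressors."""
    rts = np.asarray(rts, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    rate = average_reward_rate(rewards, alpha)
    # Reward in the preceding round; the first trial has none, so use 0.
    prev = np.concatenate(([0.0], rewards[:-1]))
    design = np.column_stack([np.ones_like(rewards), rate, rewards, prev])
    beta, *_ = np.linalg.lstsq(design, rts, rcond=None)
    names = ["intercept", "avg_rate", "reward_now", "reward_prev"]
    return {name: float(b) for name, b in zip(names, beta)}


def simulate_and_fit(n_trials=500, seed=0):
    """Hypothetical data: slowly varying rewards, RTs that shorten with rate."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_trials)
    rewards = 50 + 40 * np.sin(2 * np.pi * t / 100) + rng.normal(0, 10, n_trials)
    rate = average_reward_rate(rewards)
    # Assumed ground truth: 2 ms faster per unit of average reward rate.
    rts = 600 - 2.0 * rate + rng.normal(0, 10, n_trials)
    return fit_rt_regression(rts, rewards)


if __name__ == "__main__":
    for name, value in simulate_and_fit().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting a negative `avg_rate` coefficient corresponds to faster responding when the recent reward rate is high, which is the direction of effect the hypothesis predicts. A real analysis would fit each participant separately and test the coefficients across participants.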