Abstract:
The brains of human and nonhuman primates are thought to contain regions specialized for processing voices and faces. Although voice- and face-sensitive regions have primarily been studied in their respective sensory modalities, recent human functional magnetic resonance imaging (fMRI) studies have suggested that cross-modal interactions occur in these regions. Here, we investigated whether, and how, neuronal activity in a voice region is modulated by visual (face) stimulation. Using fMRI-guided electrophysiology, we targeted neurons in a voice-sensitive region in the right supratemporal plane of two rhesus macaques. We used dynamic faces and voices of different human and monkey individuals for stimulation, including congruent and incongruent audiovisual pairs. We observed robust non-additive visual influences of facial information on the auditory responses of neurons in this voice-sensitive region. In accordance with previous studies, the direction of the audiovisual interactions seemed to be determined primarily by the phase of visually evoked theta oscillations at auditory stimulus onset. Yet we found that, in addition, speaker-related stimulus features such as caller familiarity, caller identity, and call type, studied within a multifactorial experimental design, differentially modulated the cross-modal effects. In particular, familiar voices consistently elicited larger audiovisual influences than unfamiliar voices, despite similar auditory responses. Finally, we found neurons to be differentially sensitive to stimulus congruency: the specificity of audiovisual influences was disrupted when the congruency of a conspecific voice/face pairing was violated by substituting a human face for the monkey face. In conclusion, our results describe the nature of the visual influences on neuronal responses in a voice-sensitive region of the primate brain. This study links to human fMRI studies on multisensory influences in voice/face regions, provides insights into the neuronal cross-modal effects in these regions, and hypothesizes that neurons in face-sensitive regions might show comparable multisensory influences from the auditory domain.