Abstract
We recognize people based on cues from different modalities, such as faces and voices, but the brain mechanisms underpinning their integration are not fully understood. One proposal is that multisensory information is integrated in dedicated multimodal regions. In two fMRI studies, we aimed to identify brain regions that respond to both faces and voices, and to characterize their responses. All participants completed two runs of each of three functional localizers: visual (silent videos of non-speaking people, and scenes), auditory (voice recordings of people, and environmental sounds), and audiovisual (videos of people speaking, and scenes with their respective sounds). Using data from study 1 (N = 30), we conducted a conjunction analysis to identify multimodal regions. We considered a region multimodal if it responded more to faces, to voices, and to people speaking than to the respective control stimuli. The only brain region that consistently showed people-specific activation (24 out of 30 participants) was located in the right posterior superior temporal sulcus (STS). In study 2 (N = 12, data collection ongoing), we divided each participant’s data into two halves. One half was used to define face-selective, voice-selective, and people-specific multimodal regions, and the other half was used to extract mean activation and response patterns in these regions. The people-specific multimodal region in right posterior STS responded significantly more to audiovisual stimuli than to faces or voices alone, and it responded significantly more to voices than to faces. We then extracted multivoxel response patterns from this region. While face-responsive patterns correlated moderately with voice-responsive patterns, the correlations were significantly higher between the face- or voice-responsive patterns and the multimodal people-specific patterns. These results suggest that not all voxels in the people-specific multimodal posterior STS respond to faces and voices similarly.
In sum, the two studies allowed us to identify a region in posterior STS that shows consistent people-specific activation across modalities.
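As an illustration only, the conjunction and split-half pattern analyses described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the studies: the minimum-statistic form of the conjunction, the threshold of 3.1, the function names, and all voxel values are hypothetical.

```python
import numpy as np


def conjunction_mask(t_maps, threshold):
    """Return True where every contrast's t value exceeds the threshold."""
    stacked = np.stack(t_maps, axis=0)       # shape: (n_contrasts, n_voxels)
    return stacked.min(axis=0) > threshold   # minimum statistic across contrasts


def pattern_correlation(pattern_a, pattern_b, mask):
    """Pearson correlation between two response patterns inside a mask."""
    return float(np.corrcoef(pattern_a[mask], pattern_b[mask])[0, 1])


# Hypothetical t values for eight voxels from the half of the data used to
# define the region (each contrast: people stimuli > control stimuli).
faces_half1 = np.array([4.0, 1.0, 5.0, 3.5, 0.2, 4.2, 3.9, 2.0])
voices_half1 = np.array([3.6, 4.0, 6.0, 2.0, 0.1, 3.3, 4.4, 5.0])
audiovisual_half1 = np.array([5.0, 4.5, 7.0, 4.0, 0.3, 3.8, 4.1, 1.0])

# Assumed threshold; a voxel counts as multimodal only if all three contrasts pass.
mask = conjunction_mask(
    [faces_half1, voices_half1, audiovisual_half1], threshold=3.1
)

# Hypothetical response estimates from the independent half of the data.
faces_half2 = np.array([1.2, 0.3, 2.0, 0.9, 0.1, 0.8, 1.5, 0.4])
voices_half2 = np.array([1.8, 1.1, 2.4, 0.5, 0.0, 2.1, 1.4, 1.9])
audiovisual_half2 = np.array([2.2, 1.0, 3.1, 1.2, 0.2, 2.0, 2.1, 0.9])

# Pattern similarity within the conjunction-defined region.
r_face_voice = pattern_correlation(faces_half2, voices_half2, mask)
r_face_av = pattern_correlation(faces_half2, audiovisual_half2, mask)
r_voice_av = pattern_correlation(voices_half2, audiovisual_half2, mask)

print("voxels in region:", int(mask.sum()))
print("face-voice r:", round(r_face_voice, 3))
print("face-audiovisual r:", round(r_face_av, 3))
print("voice-audiovisual r:", round(r_voice_av, 3))
```

Defining the mask on one half of the data and reading out patterns from the other half keeps region selection independent of the responses being compared, which is the purpose of the split described for study 2.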
Acknowledgement: The study is supported by a Leverhulme Trust research grant to Lúcia Garrido (RPG-2014-392).