Abstract
Multimodal perception has previously been investigated using simple stimuli such as pure tones and geometric forms or light flashes. It is generally found that presentation of bimodal stimuli improves behavioural performance, whether in detection, localisation or identification tasks. However, little is known about the multimodal integration of biologically relevant stimuli such as faces and voices. The purpose of this study was to determine the time-course of face and voice perception, depending on the attended modality. Nineteen subjects performed a gender categorisation task on congruent or incongruent bimodal stimuli with attention directed to one or the other modality, i.e. to the faces or to the voices. Event-related potentials (ERPs) were recorded concurrently. Behavioural data showed that gender categorisation was faster for faces than for voices. Incongruent information in the unattended modality decreased accuracy and prolonged reaction times (RTs) compared to the congruent condition. ERPs were dominated by the response to the faces. Brain topography analyses showed greater activity when attention was directed to faces around 100 ms after stimulus onset, corresponding to the P1 latency. However, the face-specific ERP component, the N170, was not sensitive to the direction of attention.
These data show that directing attention to a particular sensory modality modulates early processing of visuo-auditory information, but not the N170. Further analyses will clarify how attention affects the processing of bimodal stimuli. Whether the N170 is sensitive to top-down effects is currently debated. Based on these results, we suggest that the neural mechanisms underlying the N170 are automatically recruited by faces.