Abstract
The purpose of this study was to assess how audiovisual (AV) multisensory stimuli are integrated in speech perception, using behavioral and neurological measures. The McGurk effect is a robust phenomenon that occurs when a listener is presented with conflicting auditory and visual cues of a person speaking, resulting in an incorrect auditory perception of the speech token that corresponds to the visual stimulus. This effect demonstrates the influence of vision on hearing in the perception of speech. For the experiments, the McGurk effect was generated using combinations of audio and video stimuli corresponding to /fa/ and /ba/ speech tokens, presented to the subjects using the Duet Evoked Potential System with a Video Controller Module (Intelligent Hearing Systems, Miami, FL). The system allows the precise synchronization and mixing of audio and video stimuli required to generate the McGurk effect and to record evoked potentials. Behavioral measures were obtained to determine which auditory token the subjects perceived when it was combined with the matching (congruous) or conflicting (incongruous) video. Behavioral results indicated that the video presented affects the perceived auditory token, confirming the McGurk effect in the subject population. An oddball paradigm (85% common to 15% odd) was used to record event-related potentials (ERPs) corresponding to auditory-only, visual-only, and matching or conflicting AV multisensory stimulation. P300 response amplitudes and latencies were measured. As expected, the auditory-only and visual-only stimulation each generated ERPs in response to changes from common to odd presentations of the stimulus. Similarly, the combined AV presentation with conflicting audio and video also generated a robust ERP response. The specific contributions of the perceived auditory and visual stimulus components to the overall ERP responses are analyzed and compared with those reported in other studies.
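
The 85%/15% oddball presentation described above can be illustrated with a minimal sketch. It is not the Duet system's actual sequencing: the function name `oddball_sequence`, the trial count of 200, the fixed seed, and the unconstrained random ordering are all assumptions made for illustration. The abstract does not state how trials were ordered or which token served as the common stimulus.

```python
import random

def oddball_sequence(n_trials, p_odd=0.15, seed=0):
    """Hypothetical helper: build a shuffled list of 'common'/'odd' labels.

    The proportion of odd trials is fixed exactly (not sampled per trial),
    so 15% of the trials are odd and the remaining 85% are common.
    """
    rng = random.Random(seed)              # seeded for a reproducible order
    n_odd = round(n_trials * p_odd)        # number of odd (rare) trials
    seq = ["odd"] * n_odd + ["common"] * (n_trials - n_odd)
    rng.shuffle(seq)                       # assumption: no ordering constraints
    return seq

trials = oddball_sequence(200)             # 200 trials is an assumed count
print(trials.count("odd"), trials.count("common"))  # prints "30 170"
```

In practice, oddball designs often add constraints such as a minimum number of common trials between odd ones. The sketch omits these because the abstract gives no such detail.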