Abstract
This research investigates the cognitive mechanisms underlying audiovisual integration efficiency in speech perception. Speech recognition is a multimodal process engaging both auditory and visual modalities (McGurk and MacDonald, 1976; Sumby and Pollack, 1954). In a pioneering study, Sumby and Pollack (1954) demonstrated that lip-reading, or visual speech, plays a crucial role in speech recognition by enhancing accuracy across multiple auditory signal-to-noise ratios. Traditional accuracy-only models of audiovisual integration, such as the Fuzzy Logical Model of Perception (Massaro, 2004) and Braida's Pre-Labeling Model (Braida, 1991), can adequately predict audiovisual recognition scores and integration efficiency (Grant, Walden, and Seitz, 1998), but they fail to specify the real-time dynamic mechanisms behind integration. These limitations motivated the use of (non-parametric) statistical and experimental tools in a series of recent studies (Altieri, 2010). Altieri (2010) used a reaction-time-based information processing measure known as "workload capacity" (Townsend and Nozawa, 1995) to quantify integration efficiency, or multisensory benefit, in speech perception. The capacity measure compared (transformed) reaction time distributions obtained in the audiovisual condition with those obtained in the auditory-only and visual-only conditions in speeded speech discrimination tasks. Three auditory signal-to-noise ratios were employed. The results revealed that efficient audiovisual integration, measured by a workload capacity coefficient greater than 1, was observed only at low auditory signal-to-noise ratios. New experiments using combined ERP and reaction time methods are being implemented to assess how brain signals relate to behavioral and information processing measures, including capacity.
Preliminary data analyses indicate that suppression of the audiovisual ERP waveform relative to the auditory-only ERP signal increases in frontal and left parietal/temporal regions as the auditory signal-to-noise ratio decreases and integration efficiency increases. Benefits of combined EEG/reaction time studies include obtaining generalized neural and behavioral measures of integration efficiency for speech and non-speech stimuli.
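
The workload capacity comparison described above can be illustrated with a minimal sketch. It assumes the standard first-terminating (OR) form of the coefficient from Townsend and Nozawa (1995), C(t) = H_AV(t) / (H_A(t) + H_V(t)), where H(t) = -log S(t) is the integrated hazard and S(t) is the empirical survivor function of the reaction times. The function names and the toy reaction times are hypothetical and are not data or code from the studies cited; the exact estimator used in Altieri (2010) is not specified here.

```python
import math

def survivor(rts, t):
    """Empirical survivor function S(t) = P(RT > t)."""
    return sum(1 for rt in rts if rt > t) / len(rts)

def cumulative_hazard(rts, t):
    """Integrated hazard H(t) = -log S(t); undefined once S(t) reaches 0."""
    s = survivor(rts, t)
    if s <= 0.0:
        raise ValueError("S(t) is 0 at this time point; choose a smaller t")
    return -math.log(s)

def capacity_coefficient(rt_av, rt_a, rt_v, t):
    """C(t) = H_AV(t) / (H_A(t) + H_V(t)), assuming an OR (first-terminating) design."""
    denom = cumulative_hazard(rt_a, t) + cumulative_hazard(rt_v, t)
    if denom == 0.0:
        raise ValueError("no unimodal responses by time t; choose a larger t")
    return cumulative_hazard(rt_av, t) / denom

# Hypothetical reaction times in milliseconds, for illustration only.
rt_a = [500, 600, 700, 800]    # auditory-only
rt_v = [600, 700, 800, 900]    # visual-only
rt_av = [400, 450, 500, 550]   # audiovisual

c = capacity_coefficient(rt_av, rt_a, rt_v, t=520)
print("C(520 ms) > 1:", c > 1.0)
```

Under this definition, C(t) = 1 corresponds to the prediction of an independent parallel race between the auditory and visual channels, C(t) > 1 indicates audiovisual responses faster than that benchmark (efficient integration), and C(t) < 1 indicates slower responses (limited capacity).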