Vision Sciences Society Annual Meeting Abstract  |  September 2016
Journal of Vision, Volume 16, Issue 12  |  Open Access
A causal inference model of multisensory speech perception provides an explanation for why some audiovisual syllables but not others produce the McGurk Effect
Author Affiliations
  • John Magnotti
    Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine
  • Michael Beauchamp
    Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine
Journal of Vision September 2016, Vol. 16, 580. doi: https://doi.org/10.1167/16.12.580
Abstract

Audiovisual speech integration combines information from auditory speech (talker's voice) and visual speech (talker's mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same cause, a process known as causal inference. A primary cue for this decision is the disparity between the auditory and visual speech content, with lower disparity indicating a single cause. A well-known multisensory illusion, the McGurk Effect, consists of incongruent audiovisual speech, such as auditory "ba" + visual "ga" (AbaVga), that is integrated to produce a fused percept ("da"). This illusion raises at least two questions: first, given the disparity between auditory and visual speech, why are they integrated; and second, why does the McGurk Effect occur for some syllables (e.g., AbaVga) but not other, ostensibly similar, syllables (e.g., AgaVba). We describe a Bayesian model of causal inference in multisensory speech perception (CIMS2) that calculates the percept resulting from assuming common vs. separate causes; computes the likelihood of common vs. separate causes using content disparity; averages the common and separate cause percepts weighted by their likelihood; and finally applies a decision rule to categorize the averaged percept. We apply this model to behavioral data collected from 265 subjects perceiving two incongruent speech stimuli, AbaVga and AgaVba. The CIMS2 model successfully predicted the integration (McGurk Effect) observed when human subjects were presented with AbaVga and the lack of integration (no McGurk Effect) for AgaVba. Without the causal inference step, the model predicted integration for both stimuli. Our results demonstrate a fundamental role for causal inference in multisensory speech perception, and provide a computational framework for studying speech perception in conditions of varying audiovisual disparity.
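
The averaging-and-decision pipeline described above follows the general Bayesian causal-inference scheme. The sketch below illustrates that scheme in Python under simplifying assumptions: a one-dimensional representational axis, Gaussian cue noise with a zero-centered prior, placeholder syllable prototype positions, and an auditory-only report under separate causes. The function names and parameter values are illustrative, not the fitted CIMS2 implementation.

import numpy as np

def fuse(x_a, x_v, var_a, var_v):
    # Reliability-weighted average: the percept if both cues share one cause.
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)
    return w_a * x_a + (1 - w_a) * x_v

def p_common(x_a, x_v, var_a, var_v, var_prior, prior_common):
    # Posterior probability that the auditory and visual cues share a cause.
    # Larger audiovisual disparity |x_a - x_v| lowers this probability.
    # Likelihood of the cue pair under one shared cause, cause ~ N(0, var_prior).
    var_c1 = var_a * var_v + var_a * var_prior + var_v * var_prior
    like_c1 = np.exp(-0.5 * ((x_a - x_v) ** 2 * var_prior
                             + x_a ** 2 * var_v
                             + x_v ** 2 * var_a) / var_c1) / (2 * np.pi * np.sqrt(var_c1))
    # Likelihood under two independent causes.
    var_a2, var_v2 = var_a + var_prior, var_v + var_prior
    like_c2 = np.exp(-0.5 * (x_a ** 2 / var_a2 + x_v ** 2 / var_v2)) \
              / (2 * np.pi * np.sqrt(var_a2 * var_v2))
    post_c1 = like_c1 * prior_common
    return post_c1 / (post_c1 + like_c2 * (1 - prior_common))

def perceive(x_a, x_v, var_a=1.0, var_v=1.0, var_prior=10.0, prior_common=0.5,
             prototypes=None):
    # 1. Infer the probability of a common cause from the cue disparity.
    pc = p_common(x_a, x_v, var_a, var_v, var_prior, prior_common)
    # 2. Percept under a common cause (integration) vs. separate causes
    #    (assumption: the reported syllable then follows the auditory cue).
    s_common = fuse(x_a, x_v, var_a, var_v)
    s_separate = x_a
    # 3. Average the two percepts, weighted by the causal posterior.
    s_avg = pc * s_common + (1 - pc) * s_separate
    # 4. Decision rule: categorize against placeholder syllable prototypes.
    if prototypes is None:
        prototypes = {"ba": -2.0, "da": 0.0, "ga": 2.0}
    return min(prototypes, key=lambda k: abs(prototypes[k] - s_avg))

With these placeholder settings, a small audiovisual disparity yields a high common-cause probability and an integrated (fused) percept, while a large disparity shifts the weighted average toward the auditory syllable alone; removing step 1 (fixing pc = 1) forces integration for every stimulus pair.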

Meeting abstract presented at VSS 2016
