October 2020
Volume 20, Issue 11
Open Access
Vision Sciences Society Annual Meeting Abstract | October 2020
Cross-modal suppression model of speech perception: Visual information drives suppressive interactions between visual and auditory speech in pSTG
Author Affiliations & Notes
  • Brian A. Metzger
    Baylor College of Medicine
  • John F. Magnotti
    Baylor College of Medicine
  • Elizabeth Nesbitt
    Baylor College of Medicine
  • Daniel Yoshor
    Baylor College of Medicine
  • Michael S. Beauchamp
    Baylor College of Medicine
  • Footnotes
    Acknowledgements: NIH R01NS06395, U01NS098976, and R25NS070694
Journal of Vision October 2020, Vol.20, 434. doi:https://doi.org/10.1167/jov.20.11.434
Abstract

Human speech consists of visual information from the talker’s mouth and auditory information from the talker’s voice. A key question is whether the neural computations that integrate visual and auditory speech are additive, superadditive (excitatory), or subadditive (suppressive). To answer this question, we recorded brain activity from 7 patients implanted with electrodes for the treatment of medically intractable epilepsy. We examined 33 intracranial EEG (iEEG) electrodes located over the posterior superior temporal gyrus (pSTG), a key brain area for multisensory speech perception. Patients viewed and listened to audiovisual speech words in three formats: natural asynchrony between auditory and visual speech onset, auditory speech onset advanced by 300 ms (A300V), and visual speech onset advanced by 300 ms (V300A). We used deconvolution to decompose the measured iEEG responses to audiovisual speech into unisensory auditory and visual speech responses. Manipulating the asynchrony of the auditory and visual speech allowed us to separately estimate the responses to auditory and visual speech, and hence the rule by which their neural responses were combined. Two models were then fit to the deconvolved responses. The additive model sums the deconvolved unisensory responses and was a poor fit to the actual data (RMSE = 41). The non-additive model sums the deconvolved unisensory responses plus an auditory-visual interaction term and was a better fit to the actual data (RMSE = 20). We also examined the sign of the interaction term. A positive interaction indicates a measured response greater than the summed unisensory responses (superadditivity), while a negative interaction indicates a measured response less than the summed unisensory responses (subadditivity). The interaction was negative for 25 of 33 electrodes. These data indicate a suppressive interaction between visual and auditory speech information, consistent with a cross-modal suppression model of speech perception in which early-arriving visual speech information inhibits the responses of neurons selective for incompatible auditory phonemes.
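
Model comparison sketch (illustrative only, not the authors' analysis code): the short Python example below, using assumed synthetic data for a single hypothetical electrode, sums simulated deconvolved unisensory responses for the additive model, adds a least-squares-fitted auditory-visual interaction term for the non-additive model, compares RMSE, and reports the sign of the fitted interaction. All variable names, the form of the interaction term, and the synthetic values are assumptions for illustration.

```python
# Minimal sketch of additive vs. non-additive model comparison for one
# hypothetical electrode. The real study fit deconvolved iEEG time courses;
# here the "deconvolved" responses are synthetic Gaussian bumps.
import numpy as np

rng = np.random.default_rng(0)
n_timepoints = 200
t = np.arange(n_timepoints)

# Hypothetical deconvolved unisensory responses (e.g., broadband power).
resp_auditory = np.exp(-0.5 * ((t - 80) / 20) ** 2)
resp_visual = 0.6 * np.exp(-0.5 * ((t - 60) / 25) ** 2)

# Simulated measured audiovisual response with a suppressive (negative)
# interaction, mimicking cross-modal suppression, plus measurement noise.
interaction_true = -0.4 * resp_auditory * resp_visual
resp_av = (resp_auditory + resp_visual + interaction_true
           + 0.05 * rng.standard_normal(n_timepoints))

def rmse(pred, observed):
    """Root-mean-square error between predicted and observed time courses."""
    return np.sqrt(np.mean((pred - observed) ** 2))

# Additive model: prediction is simply the sum of the unisensory responses.
pred_additive = resp_auditory + resp_visual

# Non-additive model: include an interaction regressor (here, the product of
# the unisensory responses) and fit all weights by ordinary least squares.
design = np.column_stack([resp_auditory, resp_visual,
                          resp_auditory * resp_visual])
weights, *_ = np.linalg.lstsq(design, resp_av, rcond=None)
pred_nonadditive = design @ weights

print(f"Additive model RMSE:     {rmse(pred_additive, resp_av):.3f}")
print(f"Non-additive model RMSE: {rmse(pred_nonadditive, resp_av):.3f}")
# A negative interaction weight indicates subadditivity (suppression), the
# pattern reported for 25 of 33 pSTG electrodes in the abstract.
print(f"Fitted interaction weight: {weights[2]:+.3f}")
```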
