August 2016
Volume 16, Issue 12
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2016
Lip Movements Amplify Correlated Spectral Contours in Speech
Author Affiliations
  • John Plass
    Psychology Department, Northwestern University
  • Marcia Grabowecky
    Psychology Department, Northwestern University
  • Satoru Suzuki
    Psychology Department, Northwestern University
Journal of Vision September 2016, Vol.16, 579. doi:
      John Plass, Marcia Grabowecky, Satoru Suzuki; Lip Movements Amplify Correlated Spectral Contours in Speech. Journal of Vision 2016;16(12):579.

      © ARVO (1962-2015); The Authors (2016-present)


Viewing articulatory lip movements can improve the detection and comprehension of congruent auditory speech. However, the specific audiovisual correspondences that underlie these crossmodal facilitation effects are largely unknown. We hypothesized that the perceptual system may exploit reliable natural relationships between articulatory lip motion and speech acoustics. In particular, an analysis of four speakers (two female) revealed a strong correlation (all R² > .9) between the horizontal width of the oral aperture and the frequency of the second formant as the speakers expanded and contracted their lips to pronounce the syllables /wi/ and /yu/ (APA transcription). To test whether this dynamic relationship between mouth aspect ratio and second-formant frequency underlies crossmodal facilitation of speech perception, we produced artificial stimuli that reproduced the audiovisual relationship. Visual stimuli were dark ellipses on a light background whose width expanded or contracted over 350 ms, approximating the sigmoidal change in mouth width when people pronounce /wi/ and /yu/. We verified that these ellipses were not perceived as mouths. Auditory stimuli were 100-Hz-wide bandpass-filtered white noise whose mean frequency rose or fell between 500 and 3000 Hz, approximating the second-formant frequency change. Using a bias-free two-interval forced-choice (2IFC) auditory detection task in noise, with interleaved QUEST staircases, we estimated 18 participants' detection thresholds for the frequency sweeps while they viewed ellipses with correlated horizontal motion, anti-correlated horizontal motion, or no motion. To rule out the possibility of a more general, non-speech-specific mechanism, we also included conditions in which ellipses expanded or contracted vertically. As predicted, only correlated horizontal motion significantly decreased auditory detection thresholds relative to the static control. Anti-correlated and vertical motion produced no reliable changes.
Together, these results suggest that the perceptual system exploits natural relationships between articulatory lip motion and vocal acoustics to facilitate speech perception.
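
The stimulus construction described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a logistic curve for the sigmoidal trajectory, a 44.1 kHz sample rate, and a 60 Hz display, and it approximates the 100-Hz-wide noise band by summing random-phase sinusoids around a moving center frequency rather than by filtering white noise. The function names, the steepness value, and the ellipse widths are hypothetical; the abstract does not report them.

```python
import math
import random


def sigmoid_trajectory(start, end, n_samples, steepness=10.0):
    """Move from `start` to `end` along a logistic curve (steepness is assumed)."""
    lo = 1.0 / (1.0 + math.exp(steepness * 0.5))    # logistic value at t = 0
    hi = 1.0 / (1.0 + math.exp(-steepness * 0.5))   # logistic value at t = 1
    values = []
    for i in range(n_samples):
        t = i / (n_samples - 1)                     # normalized time in [0, 1]
        s = 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))
        s = (s - lo) / (hi - lo)                    # rescale to span exactly 0..1
        values.append(start + (end - start) * s)
    return values


def narrowband_sweep(f_start, f_end, duration=0.35, sample_rate=44100,
                     bandwidth=100.0, n_components=20, seed=0):
    """Approximate a narrow noise band whose center frequency sweeps sigmoidally.

    Assumption: the band is built from `n_components` random-phase sinusoids
    spread over +/- bandwidth/2 around the center, not from filtered noise.
    """
    rng = random.Random(seed)
    n = int(round(duration * sample_rate))
    center = sigmoid_trajectory(f_start, f_end, n)
    offsets = [rng.uniform(-bandwidth / 2, bandwidth / 2)
               for _ in range(n_components)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    signal = []
    center_phase = 0.0
    for i in range(n):
        # Integrate the instantaneous center frequency to get its phase.
        center_phase += 2.0 * math.pi * center[i] / sample_rate
        t = i / sample_rate
        sample = sum(math.sin(center_phase + 2.0 * math.pi * off * t + ph)
                     for off, ph in zip(offsets, phases))
        signal.append(sample / n_components)        # keep amplitude within [-1, 1]
    return signal


if __name__ == "__main__":
    # Rising sweep from 500 to 3000 Hz over 350 ms, as in the abstract.
    audio = narrowband_sweep(500.0, 3000.0)
    # Ellipse width per video frame at an assumed 60 Hz refresh rate;
    # the widths (arbitrary units) are hypothetical.
    n_frames = int(round(0.35 * 60))
    widths = sigmoid_trajectory(2.0, 6.0, n_frames)
    print(len(audio), "audio samples;", n_frames, "video frames")
```

In this sketch, a correlated trial would pair a rising sweep with an expanding ellipse, on the assumption that wider apertures accompany higher second-formant frequencies; an anti-correlated trial would pair the same sweep with `sigmoid_trajectory(6.0, 2.0, n_frames)`.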

Meeting abstract presented at VSS 2016

