Abstract
Humans use visual speech information from a talker's mouth movements to complement auditory information from the talker's voice. Recently, we discovered individual differences in eye movements during viewing of talking faces: some observers mainly fixate the mouth of the talker, while others mainly fixate the eyes. In 34 participants, we tested the hypothesis that mouth-lookers make better use of visual speech. In experiment 1, participants viewed clear audiovisual syllables. A median split of the eye-tracking data was used to classify participants as mouth-lookers (81% of trial time spent fixating the mouth) or eye-lookers (45%). In experiment 2, participants repeated noisy auditory sentences presented alone or paired with visual speech. An ANOVA on the number of words accurately repeated showed main effects of condition (higher accuracy for audiovisual than for auditory-only speech, F = 234, p = 10⁻¹⁵) and group (higher accuracy for mouth-lookers, F = 5, p = 0.03). Critically, there was a significant interaction: mouth-lookers showed a greater improvement in accuracy when visual speech was presented (F = 7, p = 0.01). Given the higher acuity of foveal vision, fixating the talker's mouth might be expected to provide more visual speech information. To assess this possibility, we examined the eye movements made by participants during experiment 2. Both mouth-lookers and eye-lookers almost exclusively fixated the mouth (94% vs. 92% mouth-fixation time, p = 0.53), consistent with previous demonstrations that noisy auditory speech drives mouth fixation. The propensity to fixate the talker's mouth even when it is not necessary (during perception of clear audiovisual speech) is therefore linked to improved perception under noisy conditions, in which mouth movements are critical for understanding speech. We speculate that although all humans have extensive experience with talking faces, the additional time that mouth-lookers spend examining the mouth leads to greater expertise in extracting visual speech features.
Meeting abstract presented at VSS 2018
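The analysis described above (a median split on experiment-1 mouth-fixation time followed by a test of the group-by-condition interaction on experiment-2 word accuracy) could be sketched as follows. This is a minimal, hypothetical illustration, not the authors' analysis code; the file names and column names (pct_mouth, condition, accuracy) are assumptions, and the interaction in a 2 x 2 mixed design is tested here as a between-group comparison of the audiovisual benefit, which is statistically equivalent.

```python
# Hypothetical sketch of the two-step analysis; data files and column names are assumed.
import numpy as np
import pandas as pd
from scipy import stats

# Experiment 1: one row per participant, percent of trial time spent fixating the mouth.
exp1 = pd.read_csv("exp1_mouth_fixation.csv")   # columns: participant, pct_mouth
# Experiment 2: one row per participant x condition ('A' = auditory-only, 'AV' = audiovisual),
# proportion of words accurately repeated.
exp2 = pd.read_csv("exp2_word_accuracy.csv")    # columns: participant, condition, accuracy

# Median split on experiment-1 mouth-fixation time classifies mouth-lookers vs. eye-lookers.
median_pct = exp1["pct_mouth"].median()
exp1["group"] = np.where(exp1["pct_mouth"] >= median_pct, "mouth-looker", "eye-looker")

# Audiovisual benefit per participant: AV accuracy minus auditory-only accuracy.
wide = exp2.pivot(index="participant", columns="condition", values="accuracy")
wide["av_benefit"] = wide["AV"] - wide["A"]
wide = wide.join(exp1.set_index("participant")["group"])

# For a 2 (group) x 2 (condition) mixed design, the group x condition interaction is
# equivalent to comparing the AV benefit between groups with an independent-samples t-test.
mouth = wide.loc[wide["group"] == "mouth-looker", "av_benefit"]
eye = wide.loc[wide["group"] == "eye-looker", "av_benefit"]
t, p = stats.ttest_ind(mouth, eye)
print(f"AV benefit: mouth-lookers {mouth.mean():.2f} vs. eye-lookers {eye.mean():.2f} "
      f"(t = {t:.2f}, p = {p:.3f})")
```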