Abstract
Extensive research indicates that visual cues are integrated with auditory information to influence speech perception (e.g., the McGurk effect). In a separate line of work, researchers have recently reported that hand gestures are an effective tool for enhancing auditory perception during second-language (L2) learning in natural classroom environments. This is particularly true for languages like Japanese, where the contrast between long and short vowel durations conveys meaning in a way that does not exist in English. Here, we investigate whether multisensory integration is the mechanism behind this phenomenon. To remove all co-occurring cues present in a natural teaching environment other than the visual gestures, we designed a digital avatar instructor to deliver a 40-minute computer-assisted language training session. Twenty-nine native English speakers enrolled in beginning or intermediate Japanese were randomly assigned to receive instruction from either a gesturing or a stationary avatar, where the inclusion of gestures was the only difference between the two conditions. Eye-tracking was performed during training to measure how subjects attended to different elements of the digital environment: the avatar’s moving mouth (matched with the audio), silent hand gestures signaling the length of the vowel sounds (in the gesture condition), or a blackboard displaying the written target word. Both a perception task and a pronunciation task were given before and after training to evaluate the gesture effect. The digital training improved perception equally in the gesture and non-gesture groups. However, the eye-tracking data showed that overtly attending to the gestures, when present, correlated positively with performance on the perception task. Conversely, attending to the blackboard showed the greatest benefit for the non-gesture group.
Our study suggests that visual cues may be integrated with auditory signals to disambiguate speech perception during second-language acquisition. Furthermore, this effect is modulated by the learner’s strategic deployment of attention.