Ying-Zi Xiong, Nam-Anh Nguyen, Peggy Nelson, Gordon Legge; Visual Cues Reduce Spatial Uncertainty in Multi-Talker Situations. Journal of Vision 2021;21(9):2614. doi: https://doi.org/10.1167/jov.21.9.2614.
In social situations where multiple talkers speak from different locations, tracking and segregating conversations can be challenging. We hypothesized that (1) spatial uncertainty about talker locations contributes to listening difficulty, and (2) a visual cue at the target talker's location can facilitate speech recognition by reducing that spatial uncertainty. Subjects with normal hearing and vision (N = 22, 18 to 29 years) listened to sentences spoken simultaneously by three talkers separated by 10° or 20°, from different directions in the horizontal plane. In each trial, subjects attempted to repeat the sentence of a target talker indicated by a fixed starting word. Accuracy decreased from frontal (straight-ahead) to peripheral locations (left or right) and was higher with 20° separation. Mislocating words to the wrong talker was the primary error, and mislocation errors were highest when the target occupied the center position among the three talkers. When the target location was indicated by a brief visual pre-cue, mislocation errors decreased and accuracy increased; this cue benefit was significant only for the 20° separation. To model the effect of spatial uncertainty, subjects were asked to localize 200-ms auditory noise bursts (0.2-8 kHz) or visual white disks (3°) presented at random horizontal directions. Localization errors (bias and precision), modeled by individual Gaussian functions, represented the spatial uncertainty of vision and audition at each azimuth. The probability of correctly locating the target talker in the multi-talker task was predicted from three auditory Gaussian functions corresponding to the three talker directions, and the visual cue effect was modeled by applying a visual Gaussian function to the three auditory Gaussians. This simple model closely predicted multi-talker performance and the visual cue benefit.
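The modeling logic described above can be illustrated with a minimal Monte Carlo sketch. The code below is an assumption-laden simplification, not the authors' implementation: it uses a single fixed auditory standard deviation rather than azimuth-dependent bias and precision, and it interprets "applying a visual Gaussian to the auditory Gaussians" as reliability-weighted (product-of-Gaussians) cue fusion. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_correct(talker_azimuths, target_idx, auditory_sd, visual_sd=None, n=100_000):
    """Estimate the probability of attributing speech to the correct talker.

    A perceived azimuth is drawn from a Gaussian centered on the true
    target direction; the listener assigns it to the nearest talker.
    An optional visual cue is fused with the auditory estimate by
    precision weighting (a simplifying assumption, not the paper's
    exact formulation).
    """
    mu = talker_azimuths[target_idx]
    sd = auditory_sd
    if visual_sd is not None:
        # Reliability-weighted fusion: the combined estimate has
        # variance 1 / (1/sd_a^2 + 1/sd_v^2), always below either alone.
        sd = np.sqrt(1.0 / (1.0 / auditory_sd**2 + 1.0 / visual_sd**2))
    perceived = rng.normal(mu, sd, size=n)
    # Assign each perceived location to the nearest talker direction.
    dists = np.abs(perceived[:, None] - np.asarray(talker_azimuths)[None, :])
    return float(np.mean(dists.argmin(axis=1) == target_idx))

# Three talkers at 20° separation, target at the (hardest) center position.
talkers = [-20.0, 0.0, 20.0]
p_audio = p_correct(talkers, 1, auditory_sd=10.0)
p_cued = p_correct(talkers, 1, auditory_sd=10.0, visual_sd=3.0)
print(p_audio, p_cued)
```

With these illustrative parameters, the fused estimate has a much smaller standard deviation than audition alone, so the predicted probability of selecting the correct talker rises, mirroring the reported visual cue benefit at wider separations.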
Our empirical data and modeling approach revealed the important roles of both auditory and visual spatial uncertainty in multi-talker situations.