Jeremy Freeman, Eero Simoncelli; Crowding and metamerism in the ventral stream. Journal of Vision 2010;10(7):1347. doi: 10.1167/10.7.1347.
© ARVO (1962-2015); The Authors (2016-present)
Vision is degraded in the periphery. The phenomenon of “crowding” provides a striking example: objects closer together than half their eccentricity are unrecognizable. Crowding has been described as statistical or textural averaging of features over spatial regions (Parkes et al., 2001), and recently Balas et al. (2009) showed that applying a texture analysis-synthesis model (Portilla & Simoncelli, 2000) to crowded stimuli simulates crowding effects. We develop this hypothesis with an explicit model of extrastriate ventral stream processing that performs eccentricity-dependent pooling across the entire visual field. Images are decomposed with V1-like filters, followed by simple and complex-cell-like nonlinearities. Pairwise products among V1 outputs are averaged within overlapping spatial regions that grow with eccentricity according to a single scaling parameter (ratio of size-to-eccentricity). If this model captures the information available to human observers, then two properly fixated images with identical model responses should be metamers. We perform experiments to determine the scaling parameter that produces metameric images. Given a natural image, we generate images that have identical model responses, but are otherwise as random as possible. We measure discriminability between such synthetic images as a function of scaling. When images are statistically matched within small pooling regions, performance is at chance (50%), despite substantial differences in the periphery. With larger pooling regions, peripheral differences increase, and discriminability approaches 100%. We fit the psychometric function to estimate the pooling regions (scaling) over which the observer estimates statistics. The result is consistent with the known eccentricity-dependence of crowding, and also with receptive field sizes in macaque mid-ventral areas, particularly V2. 
Finally, we show that metamers synthesized from classic crowding stimuli (e.g., groups of letters) yield images with jumbled, unidentifiable objects. Thus, the model associates the spatial extent of crowding with mid-ventral receptive field sizes, and provides specific hypotheses for the computations performed by underlying neural populations.
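The pooling stage described above can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function names (`pooling_diameter`, `pooled_products`), the hard-edged windows, the toy response values, and the scaling value of 0.5 are all assumptions chosen for illustration. The abstract specifies only that pairwise products are averaged within overlapping regions whose size grows with eccentricity according to a single size-to-eccentricity ratio.

```python
def pooling_diameter(eccentricity, scaling):
    """Diameter of a pooling region centred at `eccentricity`.

    `scaling` is the size-to-eccentricity ratio; units match `eccentricity`.
    """
    return scaling * eccentricity


def pooled_products(resp_a, resp_b, positions, centers, scaling):
    """Average the pointwise product of two response channels per region.

    Assumption: hard-edged windows, used here only for brevity.
    """
    stats = []
    for c in centers:
        half = pooling_diameter(c, scaling) / 2.0
        products = [a * b
                    for a, b, x in zip(resp_a, resp_b, positions)
                    if abs(x - c) <= half]
        # An empty region has no defined statistic.
        stats.append(sum(products) / len(products) if products else float("nan"))
    return stats


# Toy data (hypothetical): samples at eccentricities 1..20 degrees.
positions = [float(x) for x in range(1, 21)]
resp_a = [1.0] * 20                           # constant channel
resp_b = [float(x) for x in range(1, 21)]     # channel that ramps with eccentricity
centers = [4.0, 8.0, 16.0]

stats = pooled_products(resp_a, resp_b, positions, centers, scaling=0.5)
print(stats)
```

With these toy inputs the region at 16 degrees averages over nine samples, while the region at 4 degrees averages over three, so more spatial detail is discarded farther from fixation. Under the hypothesis in the abstract, two images whose pooled statistics agree in every region would be indistinguishable when the scaling matches that of the observer.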