Ha Hong, Ethan Solomon, Dan Yamins, James DiCarlo; Large-scale Characterization of a Universal and Compact Visual Perceptual Space. Journal of Vision 2014;14(10):912. doi: 10.1167/14.10.912.
Many visual psychophysics experiments hypothesize a perceptual space whose axes encode key features on which judgements are made. We characterized human perceptual space for an image set with 64,000 images of 64 objects, shown with differing positions, sizes, poses, and backgrounds. We performed online psychophysical experiments involving 703 observers, obtaining confusion matrices for 2016 two-alternative forced-choice (2AFC) pairwise object identification tasks. Generalizing Getty (1979) and Ashby (1991), we hypothesized that: (1) for each object, multiple image instances sample a Gaussian point cloud in perceptual space; and (2) identity decisions could be modeled with distance-based classifiers applied to these Gaussian clouds. The dimension, locations, and spreads of the Gaussians were then chosen to be consistent with experimentally observed confusions. The resulting representation almost perfectly predicts confusions on held-out images and is stable to the addition of new objects. It also generalizes to visual tasks well beyond the original 2AFC task, predicting human responses for: (1) 8-way AFC recognition tasks, (2) ratings of objects with adjectives (e.g., "rectangular", "cuddly"), and (3) subjective similarity judgements between objects. The representation scales efficiently with object number, requiring ~47 dimensions to encode 10,000 distinct objects (Biederman, 1987). Given the scale and precision of the dataset, we were able to make direct comparisons to neural data. We found that the object layout in the inferred human perceptual space correlated highly with that from the neural population representation measured in Inferior Temporal (IT) cortex. Taken together, these results suggest that the human brain produces a visual perceptual space that is both universal (underlies behavior for many different tasks) and compact (requires few dimensions to represent many entities).
We anticipate extensions of this method will further bridge neural and perceptual observations, and help characterize how interventions (e.g., learning and attention) modify perceptual representations.
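The core model described above can be sketched in a few lines: each object is a Gaussian point cloud in perceptual space, and a 2AFC identification trial is decided by a distance-based classifier over the two clouds. The sketch below is purely illustrative (not the authors' code); the dimensionality, cloud centers, and isotropic spread are assumptions chosen for the example, and a nearest-center rule stands in for the family of distance-based classifiers the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy parameters (not fit to the paper's data):
d = 5                       # hypothetical perceptual-space dimensionality
mu_a = np.zeros(d)          # center of object A's Gaussian cloud
mu_b = np.full(d, 0.8)      # center of object B's cloud, offset on all axes
sigma = 1.0                 # isotropic spread of A's cloud (assumed)

def predicted_confusion(mu_a, mu_b, sigma, n_trials=100_000):
    """Monte Carlo estimate of P(an image of A is identified as B).

    Each trial draws an image instance from A's Gaussian cloud and
    classifies it by nearest cloud center (a distance-based decision rule).
    """
    samples = rng.normal(mu_a, sigma, size=(n_trials, len(mu_a)))
    dist_a = np.linalg.norm(samples - mu_a, axis=1)
    dist_b = np.linalg.norm(samples - mu_b, axis=1)
    return float(np.mean(dist_b < dist_a))

p = predicted_confusion(mu_a, mu_b, sigma)
```

In the actual fitting procedure, the inverse problem is solved: the dimension, centers, and spreads are chosen so that these predicted confusion rates match the observed 2016-entry confusion matrix; moving two clouds apart (or shrinking their spreads) lowers the predicted confusion between that object pair.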
Meeting abstract presented at VSS 2014