September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2018
A Texture Representation Account of Ensemble Perception
Author Affiliations
  • Sasen Cain
    Department of Psychology, University of California San Diego
  • Matthew Cain
    Natick Soldier Research, Development, & Engineering Center, U.S. Army
    Center for Applied Brain & Cognitive Sciences, Tufts University
Journal of Vision September 2018, Vol. 18, 618.
Can multiscale image statistics (Portilla & Simoncelli, 2000) explain ensemble perception phenomena better than feedforward object recognition? In the conventional view, objects' properties are rapidly measured and averaged to provide scene gist information (Alvarez, 2011), but this mechanism does not fit patterns of human performance. We showed (Cain, Dobkins, Vul, VSS 2016) that mean circle size judgments were systematically biased when comparing different numbers of items: participants selected the display with more items as larger, incurring robust point of subjective equality (PSE) shifts. Others have noted this perturbation (Chong & Treisman, 2005; Sweeny et al., 2014), yet maintain the conventional view. We replicated the experiment and developed a computational model of how texture statistics could, without individuating or measuring objects, explain both the successes and failures of human ensemble perception. We trained linear support vector machines (SVMs) with three feature sets that reflect increasing amounts of image structure: pixel statistics (multiscale luminance properties; 16 features), marginal statistics (pixel statistics plus autocorrelations; 421 features), and full texture statistics (marginal statistics plus crosscorrelations; 2456 features). The 48 easiest Equal set-size trials formed the training set; on each 2AFC trial, we computed the difference in these statistics between the two displays. Each SVM's classifications on the remaining 768 trials were used to fit its psychometric function. Compared to humans' PSE shifts, the pixel SVM's PSE shift was too extreme (a 2:1 ratio, as predicted), while both higher-order feature sets (marginal and full) matched humans' PSE shifts. Because the marginal and full SVMs responded identically, the additional crosscorrelation features are unnecessary for explaining human behavior on this task, while the autocorrelations in the marginal statistics are crucial.
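As a rough illustration of the simplest of the three feature sets, multiscale luminance "pixel statistics" can be sketched as moments computed at successive dyadic downsamplings, with the per-trial input to a classifier being the difference of these statistics between the two 2AFC displays. This is a minimal sketch, not the authors' implementation: the particular moments (mean, variance, skew, kurtosis), the four scales, and the helper names `pixel_statistics` and `trial_features` are assumptions made for illustration; the actual 16-feature set follows Portilla & Simoncelli (2000).

```python
import numpy as np

def pixel_statistics(img, n_scales=4):
    """Multiscale luminance statistics (illustrative stand-in for the
    abstract's 16-feature 'pixel statistics' set): mean, variance,
    skew, and kurtosis at each of n_scales dyadic downsamplings."""
    feats = []
    for _ in range(n_scales):
        m = img.mean()
        v = img.var()
        c = img - m
        skew = (c ** 3).mean() / (v ** 1.5 + 1e-12)
        kurt = (c ** 4).mean() / (v ** 2 + 1e-12)
        feats += [m, v, skew, kurt]
        # 2x2 block-average downsample for the next scale
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.array(feats)

def trial_features(display_a, display_b):
    """Per-trial classifier input: difference in statistics
    between the two displays of a 2AFC trial."""
    return pixel_statistics(display_a) - pixel_statistics(display_b)
```

A linear SVM trained on such difference vectors (labeled by which display should be chosen) then plays the role of the ideal observer for this feature set; the higher-order sets would append auto- and cross-correlation features in the same way.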
Our ideal observers successfully reproduce robust human biases on an ensemble mean task—without explicit object representation. This texture representation approach could be applied to arbitrary scene stimuli to more parsimoniously explain parallel preattentive processing.
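The psychometric-function fit that yields each observer's PSE can be sketched as fitting a cumulative Gaussian to choice proportions across stimulus levels; the PSE is the level at which the fitted curve crosses 50%, so a nonzero PSE under unequal set sizes is the bias described above. The grid-search least-squares fit and the names `cum_gauss` and `fit_pse` below are illustrative assumptions; the abstract does not specify the fitting procedure.

```python
import numpy as np
from math import erf, sqrt

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian psychometric function; mu is the PSE."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def fit_pse(levels, p_chose, mus, sigmas):
    """Least-squares fit over a (mu, sigma) grid.
    levels: stimulus levels (e.g., log size ratio of the two displays)
    p_chose: proportion of trials the target display was chosen
    Returns the best-fitting (mu, sigma); mu is the PSE."""
    best, best_err = None, float("inf")
    for mu in mus:
        for sg in sigmas:
            pred = np.array([cum_gauss(x, mu, sg) for x in levels])
            err = float(((pred - np.asarray(p_chose)) ** 2).sum())
            if err < best_err:
                best, best_err = (mu, sg), err
    return best
```

Comparing the fitted mu for human choices against the mu fitted to each SVM's classifications is how the pixel, marginal, and full models can be scored against the human PSE shift.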

Meeting abstract presented at VSS 2018

