Abstract
People infer summary properties of scenes, but is this done by statistical aggregation of individuated objects (Alvarez, 2011; Ariely, 2001)? Or could image-level features that do not require any scene parsing, such as local texture (Portilla & Simoncelli, 2000; Balas, Nakano, & Rosenholtz, 2009), better explain the wide range of phenomena attributed to ensembles? If ensemble mean judgments harness parallel pre-attentive mechanisms operating on objects (Chong & Treisman, 2003), then the judgment should be independent of the number of objects being judged. This leads to the prediction that when comparing the mean sizes of objects in two sets, the number of objects in each set should not influence the perceived mean sizes. To test this, we presented 6 participants with a two-alternative forced-choice task in which they reported which of two simultaneously presented sets of circles had the larger mean diameter. Each display could have Equal or Unequal set-sizes. Contrary to the object-based hypothesis, the more numerous set was generally judged to have the larger mean diameter, biasing the point of subjective equality (PSE) (t(5)=4.96, p=0.004). While a low-level cue to total area (e.g., luminance in each hemifield) may be sufficient for judgments between equal set-sizes, it does not explain the current data with unequal set-sizes, as this model overestimates participants' PSE shifts (t(5)=-5.35, p=0.003). Finally, we considered whether performance differences between Equal and Unequal trials could arise from flexibly selecting different strategies. We conclude that this is unlikely because participants' psychometric curves were consistent across blocked (Equal-only or Unequal-only) and intermixed runs. Therefore, we argue that a mid-level representation such as local texture is needed to capture these patterns of behavior in tasks meant to elicit mean size computations.
Importantly, these rich summary statistics might encapsulate scene gist—and allow ensemble tasks to be performed well—without representing or measuring objects at all.
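To illustrate why a pure total-area cue predicts a large PSE shift with unequal set-sizes, the following is a minimal sketch, not the model fitted in the study. It assumes an observer who simply picks the set with the greater summed circle area, and that both sets share the same relative spread of diameters. The function names and the set sizes (8 and 12) are hypothetical, chosen only for illustration.

```python
import math


def total_area(diameters):
    """Summed area of circles with the given diameters."""
    return sum(math.pi * (d / 2) ** 2 for d in diameters)


def predicted_pse_ratio(n_reference, n_comparison):
    """Mean-diameter ratio (comparison / reference) at which the two sets
    have equal total area.

    Assumption: both sets have the same relative spread of diameters, so
    total area scales with n * mean_diameter**2.
    """
    return math.sqrt(n_reference / n_comparison)


# Hypothetical set sizes; the abstract does not report the actual values.
n_ref, n_comp = 8, 12
ratio = predicted_pse_ratio(n_ref, n_comp)

# Sanity check with uniform diameters: the two total areas match at the PSE.
reference_set = [1.0] * n_ref
comparison_set = [ratio] * n_comp
areas_match = math.isclose(total_area(reference_set), total_area(comparison_set))
```

Under these assumptions, the more numerous set would be judged equal to the smaller set only when its mean diameter is reduced by the factor `ratio`, and with equal set-sizes the ratio is 1, so no shift is predicted. A shift of this size is the kind of prediction that the reported data fall short of.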
Meeting abstract presented at VSS 2016