Abstract
When we recognize an object, we automatically know how big it is in the world (Konkle & Oliva, 2012). Here we asked whether this automatic activation relies on explicit recognition of the object's basic-level category, or whether it can be triggered by mid-level visual features alone. To explore this question, we gathered images of big and small objects (e.g., car, shoe) and then generated texture stimuli by coercing white noise to match the mid-level image statistics of the original objects (Freeman & Simoncelli, 2011). Behavioral ratings confirmed that these textures were unidentifiable at the basic level (N=30; identification rate: M = 2.8%, SD = 4%). In Experiment 1, participants made a speeded judgment about which of two textures was visually bigger or smaller on the screen. Critically, the visual sizes of the textures were either congruent or incongruent with the real-world sizes of the original objects. Participants were faster at judging the visual size of a texture when its real-world size was congruent (M = 504 ms) rather than incongruent (M = 517 ms) with its size on the screen (t(15) = 3.79, p < .01). This result suggests that these texture stimuli preserve shape features that are diagnostic of real-world size and that automatically activate this association. Consistent with this interpretation, we found that a new set of observers could classify these textures as big or small in the real world at a rate slightly above chance (N=30; small objects: 63.2%, big objects: 56.4%), and that the magnitude of the Stroop effect was greater for textures that were more consistently associated with big or small real-world sizes (F(1, 23) = 38, p < .001). Taken together, these results suggest that mid-level visual features are sufficient to automatically activate real-world size information.
Meeting abstract presented at VSS 2015