Abstract
Humans are remarkably fast and accurate at recognizing places, or “scenes”. Not surprisingly, then, there are cortical processes specialized for scene recognition; however, it remains unknown how humans distinguish scenes from non-scene stimuli, such as faces and objects. Here, we hypothesize that, just as faces always have two eyes above a nose above a mouth, there also exist scene-defining visual features that enable the human brain to recognize scenes. To identify a potential scene-defining feature, we analyzed thousands of highly variable naturalistic scene images and found that, across most scenes, there is a vertical asymmetry in luminance, with the upper half brighter than the lower half. Next, we asked whether this vertical luminance asymmetry (VLA) is not only a common scene feature but also necessary to engage human visual scene processing. We predicted that if VLA is indeed necessary to engage scene processing, then a 90-degree image rotation that disrupts the VLA of a scene will impair scene recognition. Consistent with our hypothesis, we found that people are worse at recognizing scenes that are rotated away from their upright, canonical orientation (90-degree, 180-degree, and 270-degree rotations), whereas object recognition is unaffected by image rotation. Similarly, using functional magnetic resonance imaging (fMRI), we found that the cortical scene processing system shows a diminished response to rotated scene images, whereas the cortical object processing system does not differentiate objects across orientations. Taken together, these results provide converging stimulus-based, behavioral, and neural evidence that VLA is a scene-defining feature that enables the human brain to differentiate scenes from non-scene stimuli.
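As a minimal illustration (not the authors' actual analysis pipeline), the VLA of a grayscale image could be quantified as the difference between the mean luminance of the upper and lower halves, normalized by their sum; the function name and the synthetic sky-over-ground image below are assumptions for demonstration:

```python
import numpy as np

def vertical_luminance_asymmetry(img):
    """Mean luminance of the upper half minus the lower half,
    normalized by their sum (positive => upper half is brighter)."""
    h = img.shape[0] // 2
    upper = img[:h].mean()
    lower = img[-h:].mean()
    return (upper - lower) / (upper + lower)

# Synthetic "scene-like" image: a bright sky above a darker ground.
sky = np.full((50, 100), 200.0)
ground = np.full((50, 100), 80.0)
scene = np.vstack([sky, ground])

print(vertical_luminance_asymmetry(scene))            # positive: upper half brighter
print(vertical_luminance_asymmetry(np.rot90(scene)))  # ~0: 90-degree rotation disrupts the VLA
```

On this toy image the upright scene yields a clearly positive VLA, while a 90-degree rotation mixes bright and dark pixels evenly across rows and drives the asymmetry to zero, mirroring the rotation manipulation described in the abstract.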