Abstract
The differentiation between things and stuff, between man-made and natural scenes, and between different scene complexities has previously been identified as a key component of scene appearance. However, it remains unclear how to categorize these aspects of scenes using image-based metrics. Groen et al. (Journal of Neuroscience, 2013) found that natural and man-made image content could be loosely characterized along two dimensions: spatial coherence (the variation in edge density, where low variation means high coherence) and contrast energy (the average local contrast). Here, we find that these statistics can similarly differentiate images in the THINGS and STUFF databases. While the two databases are not perfectly separated, images from THINGS tend to have high spatial coherence and high contrast energy (scene-like), whereas images from STUFF tend to have low spatial coherence and low contrast energy (texture-like). To test whether variation in these statistics correlates with differences in perceptual processing, we examined human sensitivity to Eidolon distortions in sets of images drawn from each of these two quadrants, independent of database membership. Participants discriminated between a natural and an Eidolon-distorted image in a 2IFC task, across different distortion intensities (reach) and spatial frequencies (grain). Images were presented 6.4 degrees to the right of fixation and subtended 7.5 degrees in diameter. We found that Eidolon distortions were easier to detect (lower reach thresholds) at all grain values in scene-like images than in texture-like images. Together, these data indicate that the low-dimensional representation of spatial coherence and contrast energy can place images on a scale ranging from things to stuff, at least in terms of perceptual sensitivity to spatial distortions.
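Because the abstract defines the two image statistics only operationally, the Python sketch below illustrates one plausible way to compute them from a grayscale image. The gradient-magnitude contrast proxy, the edge threshold, the 16-pixel patch size, and the function name `scene_statistics` are all illustrative assumptions, not the exact pipeline used by Groen et al. (2013).

```python
import numpy as np
from scipy import ndimage

def scene_statistics(image, patch=16):
    """Approximate contrast energy and spatial coherence for a 2-D
    grayscale image (float array). Both operationalizations below are
    illustrative assumptions based on the abstract's definitions:
      contrast energy  ~ average local contrast
      spatial coherence ~ inverse of the variation in edge density
                          (low variation -> high coherence)
    """
    # Local contrast proxy: gradient magnitude from Sobel filters.
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    grad = np.hypot(gx, gy)

    # Contrast energy: mean local contrast over the whole image.
    contrast_energy = grad.mean()

    # Edge map via a simple (assumed) threshold on gradient magnitude.
    edges = grad > grad.mean() + grad.std()

    # Edge density per non-overlapping patch.
    h, w = edges.shape
    h, w = h - h % patch, w - w % patch          # crop to patch multiple
    blocks = edges[:h, :w].reshape(h // patch, patch, w // patch, patch)
    density = blocks.mean(axis=(1, 3))           # fraction of edge pixels

    # Low variation in edge density across patches = high coherence.
    spatial_coherence = 1.0 / (density.std() + 1e-8)
    return contrast_energy, spatial_coherence
```

Under this sketch, a scene-like THINGS image would be expected to yield high values on both measures, and a texture-like STUFF image low values on both, though any specific thresholds for the quadrant split are not given in the abstract.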