Abstract
Scene recognition is a fundamental problem in visual perception, yet scene understanding research has been limited by the lack of proper knowledge of visual scene ontology and by the absence of a meaningful scene representation. We recently proposed a new experimental paradigm for defining and determining perceptual relationships between scene categories (Kadar & Ben-Shahar, 2012). Here we introduce "SceneNet", a new and comprehensive ontology database of scene categories that is derived directly from a large-scale human vision study and organizes scene categories according to their perceptual relationships. This ontology database suggests that perceptual relationships do not always conform to the semantic structure between categories, and it provides a lower-dimensional perceptual space with a "perceptually meaningful" Euclidean distance, in which each embedded scene category is represented by a single prototype. We also incorporate the SceneNet ontology into a computational scheme for learning a non-linear mapping of scene images into the perceptual space, where each scene image is closer to its own category prototype than to any other prototype by a large margin. In addition to yielding much better computational results on various large-scale scene understanding operations, the SceneNet database provides important insights into human scene representation and organization and may serve as a key element in a better understanding of this important perceptual capacity.
Meeting abstract presented at VSS 2013
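
The large-margin prototype constraint described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss or the mapping, so everything here is an assumption made for illustration: the hinge-style penalty, the function names, the margin value, and the toy two-dimensional prototype coordinates are all hypothetical and are not taken from SceneNet.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points in the (assumed) perceptual space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_prototype(embedding, prototypes):
    """Return the category whose prototype is closest to the embedded image."""
    return min(prototypes, key=lambda name: euclidean(embedding, prototypes[name]))

def margin_violation(embedding, label, prototypes, margin=1.0):
    """Hinge-style penalty (an assumed form, not the authors' stated loss).

    It is zero when the embedded image is closer to its own category
    prototype than to every other prototype by at least `margin`.
    """
    d_own = euclidean(embedding, prototypes[label])
    d_other = min(
        euclidean(embedding, point)
        for name, point in prototypes.items()
        if name != label
    )
    return max(0.0, margin + d_own - d_other)

# Hypothetical prototypes: one point per scene category.
prototypes = {
    "beach": (0.0, 0.0),
    "forest": (4.0, 0.0),
    "street": (0.0, 3.0),
}

well_placed = (0.5, 0.0)   # close to the "beach" prototype
ambiguous = (2.0, 0.0)     # equidistant from "beach" and "forest"

print(nearest_prototype(well_placed, prototypes))
print(margin_violation(well_placed, "beach", prototypes))
print(margin_violation(ambiguous, "beach", prototypes))
```

In this toy setting, a learned mapping would be trained so that the penalty is driven toward zero for every training image. Classification then reduces to a nearest-prototype lookup in the perceptual space.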