Abstract
The ability to categorize visual scenes rapidly and accurately is highly valuable for both biological and machine vision. Following the seminal demonstrations that humans can recognize scenes in a fraction of a second (e.g., Potter and Levi, 1969; Biederman, 1972), much research has been devoted to understanding the underlying visual process (e.g., Thorpe et al., 1996; Oliva and Torralba, 2001; Loschky and Larson, 2008, 2010), as well as to modeling it computationally (e.g., Fei-Fei and Perona, 2005; Lazebnik et al., 2006; Xiao et al., 2010). In this work we focus on one aspect of the scene categorization process and investigate whether prior knowledge about the perceptual relations between scene categories may facilitate better, more efficient, and faster scene categorization. We first introduce a psychophysical paradigm that probes human scene categorization and extracts perceptual relations between scene categories. Then, we show that these perceptual relations do not always conform to the semantic structure between categories. Finally, we incorporate the obtained perceptual relations into a computational classification scheme that takes inter-class relationships into account to obtain better scene categorization, particularly when supervised categories are under-sampled. We argue that prior knowledge of such relationships could partly explain why humans are often able to learn and process scene categories from very few training examples, while computational models usually need at least tens of training examples per category before achieving reasonable categorization performance.
Meeting abstract presented at VSS 2012