Abstract
Humans are adept at determining the basic-level category of natural scenes (Tversky & Hemenway, 1983). What visual features of an image does an observer use in such categorization tasks? Previous computational studies have established that scenes can be classified using power spectral information (i.e., the magnitude of spatial frequencies; Oliva & Torralba, 2001) and local texture descriptors (Fei-Fei & Perona, 2005). Here we take a new approach to identifying features that distinguish between categories: comparing good and bad examples of a category. If a particular feature is relevant to human categorization, it should also yield better classification for good than for bad examples of that category. Using linear pattern recognition algorithms, we performed multi-way classification on six categories (beaches, city streets, forests, highways, mountains, and offices), each comprising 50 images that were rated by naïve participants as “good” examples of their category and an additional 50 that were rated as “bad” examples of their category (Torralbo et al., VSS 2009). We found that several feature sets, including the power spectrum, color histogram, and local surface geometry and texture information (Hoiem et al., 2005), resulted in average classification rates significantly above chance. More importantly, when these classification results were separated into “good” and “bad” examples, all three feature sets showed higher classification accuracy for “good” than for “bad” category exemplars. These results suggest that all three feature sets are viable candidates for the features that humans could use to distinguish among our natural scene categories.
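The logic of the analysis (train a linear multi-way classifier, then score "good" and "bad" exemplars separately) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the procedure used in the study: the feature vectors are synthetic random numbers rather than power spectra, color histograms, or surface geometry descriptors; the classifier is a ridge-regularized least-squares fit to one-hot targets; and the 10-fold cross-validation scheme is hypothetical. The good-versus-bad difference is built into the synthetic data by construction, so the printed numbers demonstrate the bookkeeping only and are not results.

```python
import numpy as np

CATEGORIES = ["beach", "city street", "forest", "highway", "mountain", "office"]


def make_synthetic_features(rng, n_per_group=50, n_dims=20):
    """Return (X, y, is_good) for hypothetical feature vectors.

    Good exemplars sit near their category prototype; bad exemplars are
    pulled toward the origin so that categories overlap more. This
    difference is imposed by construction and is NOT data from the study.
    """
    prototypes = rng.normal(size=(len(CATEGORIES), n_dims))
    X, y, is_good = [], [], []
    for label, proto in enumerate(prototypes):
        for good, scale in ((True, 1.0), (False, 0.3)):
            X.append(scale * proto + rng.normal(size=(n_per_group, n_dims)))
            y.extend([label] * n_per_group)
            is_good.extend([good] * n_per_group)
    return np.vstack(X), np.array(y), np.array(is_good)


def fit_linear(X, y, n_classes, ridge=1e-3):
    """Fit a least-squares linear classifier to one-hot targets (with bias)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    targets = np.eye(n_classes)[y]
    gram = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(gram, Xb.T @ targets)


def predict_linear(W, X):
    """Assign each row of X to the class with the largest linear score."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.argmax(Xb @ W, axis=1)


def cross_validated_predictions(X, y, n_classes, rng, n_folds=10):
    """Predict every image once, using a model trained on the other folds."""
    order = rng.permutation(len(y))
    preds = np.empty_like(y)
    for fold in np.array_split(order, n_folds):
        train = np.setdiff1d(order, fold)
        W = fit_linear(X[train], y[train], n_classes)
        preds[fold] = predict_linear(W, X[fold])
    return preds


rng = np.random.default_rng(0)
X, y, is_good = make_synthetic_features(rng)
preds = cross_validated_predictions(X, y, len(CATEGORIES), rng)

# Score all images, then split the same predictions by exemplar rating.
correct = preds == y
chance = 1.0 / len(CATEGORIES)
acc_all = correct.mean()
acc_good = correct[is_good].mean()
acc_bad = correct[~is_good].mean()

print(f"chance level: {chance:.3f}")
print(f"all images:   {acc_all:.3f}")
print(f"good only:    {acc_good:.3f}")
print(f"bad only:     {acc_bad:.3f}")
```

With six categories, chance is 1/6. In this sketch one classifier is trained on good and bad exemplars together and the split is applied only when scoring; whether the study trained on the pooled set or on a separate set is not stated in the abstract.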
This work is funded by the NIH (LFF, DMB, DBW), a Beckman Postdoctoral Fellowship (DBW), a Microsoft Research New Faculty Fellowship (LFF), and the Frank Moss Gift Fund (LFF).