Matt Anderson, Wendy Adams, Erich Graf, Krista Ehinger, James Elder; Human-Centered Categorization of Natural Scenes. Journal of Vision 2018;18(10):141. doi: https://doi.org/10.1167/18.10.141.
© ARVO (1962-2015); The Authors (2016-present)
A scene's categorical identity (e.g., forest or beach) contains useful information, such as the probable identity and location of objects, and the actions that might occur within it. Recent work has provided insights into the computational processes underlying categorization (review: Malcolm, Groen & Baker, 2016). However, most existing category systems are defined by labels selected by small groups of researchers (e.g., Oliva & Torralba, 2001; Fei-Fei & Perona, 2005), or exhaustive vocabularies of place names (e.g., Deng et al., 2009; Xiao, Hays, Ehinger, Oliva, & Torralba, 2010). Here we present a new, psychologically valid method of deriving categories, and report categorization data across three different dimensions for images of natural scenes from the SYNS dataset (Adams et al., 2016). Human observers organized 80 images in a free sorting task. In separate experiments, images were grouped according to (i) semantic content, (ii) 3D spatial structure, or (iii) 2D image appearance. Observers subsequently generated up to 5 text labels to describe each group. Using leave-one-out cross-validation, we determined the most representative category structure for each dimension, and then assigned labels to each category. Inter-observer consistency was highest for semantic categorization. Our analyses reveal reliable relationships between category dimensions. For example, images in the semantic 'Coast' category were also associated with the 'Flat' 3D spatial structure category, and the 'Blue' 2D appearance category. A Naïve Bayes classifier trained to predict category membership in one dimension from category membership in the other two dimensions performed at around 70% accuracy. This exceeded the maximum chance-level performance of 34.17%. These results support scene-centered theories of category representation, which assert that semantic categories can be derived from a limited set of global image properties (e.g., Oliva & Torralba, 2001).
The proposed category structures will improve the psychological validity of studies that explore scene categorization.
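The cross-dimension prediction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a categorical Naive Bayes model with Laplace smoothing, scores it with leave-one-out cross-validation, and compares it against a majority-class baseline, which is only one plausible reading of "maximum chance-level performance". The function names and the toy data are hypothetical. 'Coast', 'Flat', and 'Blue' are category names taken from the abstract, while 'Forest', 'Cluttered', and 'Green' are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, class)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    # values[feature_index] -> set of values seen for that feature
    values = defaultdict(set)
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values, alpha


def predict_nb(model, row):
    """Return the class with the highest log posterior for one row."""
    class_counts, feature_counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, cy in sorted(class_counts.items()):
        score = math.log(cy / n)  # log prior
        for j, v in enumerate(row):
            k = len(values[j] | {v})  # number of distinct values, incl. unseen
            score += math.log((feature_counts[(j, y)][v] + alpha) / (cy + alpha * k))
        if score > best_score:
            best, best_score = y, score
    return best


def loo_accuracy(rows, labels):
    """Leave-one-out cross-validation: hold out each item once."""
    hits = 0
    for i in range(len(rows)):
        model = train_nb(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict_nb(model, rows[i]) == labels[i]
    return hits / len(rows)


def majority_baseline(labels):
    """Accuracy of always guessing the most frequent category."""
    return max(Counter(labels).values()) / len(labels)


# Toy data (invented): predict the semantic category of each image from its
# 3D spatial structure category and its 2D appearance category.
toy_rows = [("Flat", "Blue")] * 4 + [("Cluttered", "Green")] * 4
toy_labels = ["Coast"] * 4 + ["Forest"] * 4

if __name__ == "__main__":
    print("leave-one-out accuracy:", loo_accuracy(toy_rows, toy_labels))
    print("majority baseline:", majority_baseline(toy_labels))
```

On this deliberately clean toy set the classifier separates the two categories perfectly, which real sorting data would not. The point is only the shape of the comparison: cross-validated accuracy for one dimension, predicted from the other two, set against a chance baseline.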
Meeting abstract presented at VSS 2018