Lior Elazary, Laurent Itti; Interesting objects in natural scenes are more salient. Journal of Vision 2007;7(9):947. doi: 10.1167/7.9.947.
© ARVO (1962-2015); The Authors (2016-present)
How do we decide which objects in a visual scene are more interesting? Intuition suggests a complex process of recognizing different candidate scene elements in turn, evaluating their identity and other attributes against behavioral preferences and goals, and finally deciding which among the candidates are more relevant and interesting. Here we investigate the contributions of a much simpler process, saliency-based visual attention. We used the publicly available LabelMe database of 24,863 digital photographs in which 74,454 presumably interesting objects have been manually outlined. We evaluated how often these objects were among the few most salient locations selected by a computational model of bottom-up attention. We find that in 43% of all images the model's first fixation falls within a labeled region, about twice the chance level (21%). Furthermore, within three fixations, the saliency map is able to pick a labeled region over 85% of the time, with performance leveling off after six fixations. The bottom-up attention model has no notion of objects or of semantic relevance. Hence, our results indicate that selecting interesting objects in a scene is largely constrained by low-level visual properties of scene elements, rather than solely determined by recognition and higher cognitive processes. The saliency map is a strong predictor of what humans find interesting in complex natural scenes.
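The evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the saliency map is taken as any precomputed 2D array, successive fixations are picked by a simple argmax with disk-shaped inhibition of return, and chance is assumed to be the fraction of image area covered by labeled regions. The function names (`top_fixations`, `first_hit`, `chance_level`) and the `radius` parameter are hypothetical and introduced only for illustration.

```python
import numpy as np

def top_fixations(saliency, n, radius):
    """Pick n fixation points: repeatedly take the most salient pixel,
    then suppress a disk around it (a crude inhibition of return; assumption)."""
    s = saliency.astype(float).copy()
    h, w = s.shape
    yy, xx = np.mgrid[0:h, 0:w]
    points = []
    for _ in range(n):
        y, x = np.unravel_index(np.argmax(s), s.shape)
        points.append((int(y), int(x)))
        # Exclude the neighborhood of this fixation from later picks.
        s[(yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2] = -np.inf
    return points

def first_hit(points, mask):
    """Return the 1-based index of the first fixation that lands inside
    a labeled region (mask is True there), or None if none does."""
    for i, (y, x) in enumerate(points, start=1):
        if mask[y, x]:
            return i
    return None

def chance_level(mask):
    """Probability that a uniformly random fixation lands in a labeled
    region (assumed definition of chance for this sketch)."""
    return float(mask.mean())

# Toy example: an 8x8 image with one labeled 3x3 object.
saliency = np.zeros((8, 8))
saliency[6, 6] = 1.0   # most salient point, outside the object
saliency[1, 1] = 0.9   # second most salient point, inside the object
mask = np.zeros((8, 8), dtype=bool)
mask[0:3, 0:3] = True

fixations = top_fixations(saliency, n=2, radius=2)
hit_index = first_hit(fixations, mask)
```

Averaging `first_hit(...) == 1` over a set of images would give a first-fixation hit rate comparable in spirit to the 43% figure, and averaging `chance_level(mask)` would give the corresponding baseline under the stated assumption.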