Aude Oliva, Antonio Torralba, Monica S Castelhano, John M Henderson; Top-down control of visual attention in real world scenes. Journal of Vision 2003;3(9):3. https://doi.org/10.1167/3.9.3.
© ARVO (1962-2015); The Authors (2016-present)
During the first glance at a complex scene, the observer's attention is driven towards a particular region of the scene and the first saccade is programmed. Studies in scene recognition have acknowledged that significant structural and spatial layout information is extracted within a glance to form a semantic “gist” of the scene. Information included in the gist therefore forms a visual context that is likely to modulate, in a top-down manner, where attention will land in a complex scene image. In this presentation, we extend a computational model of the gist, which encodes the coarse spectral layout of a scene image, to incorporate attentional guidance mechanisms and generate eye movements. The model uses the statistical correlations that exist between global scene structure (e.g., a street scene is in perspective) and object properties (e.g., the location of pedestrians) to define a region of interest in the image that is relevant for solving a task (e.g., looking for people). The eye movements of 8 human observers were monitored while they searched for a specific object (people) in 36 real-world scenes. The region of interest scrutinized by observers and the region determined by the gist guidance schema overlapped in more than 85% of the cases. Multiple fixation points (i.e., saccade targets) within the region of interest were generated by integrating a bottom-up saliency model with the top-down attentional guidance mechanism. Using a set of similarity metrics, we show that the locations of the multiple fixations generated by the integrative model and by human observers were very similar. The results validate the proposition that top-down information from visual context modulates the saliency of image regions early during the task of object detection.
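The integration of bottom-up saliency with top-down contextual guidance described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how the two signals are combined, so the multiplicative combination, the Gaussian row-band prior, the inhibition-of-return step, and the names `context_prior` and `generate_fixations` are all assumptions made here for illustration.

```python
import numpy as np

def context_prior(height, width, expected_row, row_sigma):
    """Hypothetical top-down prior: a horizontal band centred on the image row
    where the target is expected, given the scene's global structure."""
    rows = np.arange(height, dtype=float)
    band = np.exp(-0.5 * ((rows - expected_row) / row_sigma) ** 2)
    return np.tile(band[:, None], (1, width))

def generate_fixations(saliency, prior, n_fixations=3, ior_radius=2):
    """Pick successive fixations as maxima of saliency * prior (an assumed
    combination rule), suppressing a disc around each chosen point."""
    combined = saliency.astype(float) * prior  # new array; inputs are not modified
    yy, xx = np.mgrid[0:combined.shape[0], 0:combined.shape[1]]
    fixations = []
    for _ in range(n_fixations):
        r, c = np.unravel_index(np.argmax(combined), combined.shape)
        fixations.append((int(r), int(c)))
        # Inhibition of return: the attended neighbourhood cannot win again.
        combined[(yy - r) ** 2 + (xx - c) ** 2 <= ior_radius ** 2] = -np.inf
    return fixations

# Toy scene: a strong salient spot near the top (e.g., sky) and a weaker one
# lower down, where the prior expects the target to be.
saliency = np.zeros((10, 10))
saliency[1, 1] = 1.0
saliency[6, 4] = 0.6
prior = context_prior(10, 10, expected_row=6, row_sigma=1.0)
print(generate_fixations(saliency, prior, n_fixations=1))
```

In this toy case the contextual prior redirects the first fixation from the globally most salient point to the weaker peak inside the expected band, which is the qualitative effect the abstract attributes to visual context.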