Alex Hwang, Marc Pomplun; A model of top-down control of attention during visual search in real-world scenes. Journal of Vision 2008;8(6):681. doi: https://doi.org/10.1167/8.6.681.
© ARVO (1962-2015); The Authors (2016-present)
Recently, there has been great interest among vision researchers in devising computational models that predict the distribution of saccadic endpoints in naturalistic scenes (e.g., Itti & Koch, Vis. Res. 2000; Bruce & Tsotsos, NIPS 2006). In these studies, subjects are instructed to view the scenes without any particular task in mind so that stimulus-driven (bottom-up) processes guide visual attention. However, whenever there is a task, additional goal-driven (top-down) processes play an important, and most often dominant, role. Pomplun (Vis. Res. 2006) showed that during visual search in real-world scenes, attention is systematically biased towards image features that resemble those of the search target. Therefore, in order to understand and predict attentional selection in real-world scenes, we need a computational model of top-down attentional control in addition to existing bottom-up models. In the present study, we devised such a top-down model based on three basic principles. First, visual similarity between the search target and local image portions is defined for several stimulus dimensions using a histogram-matching technique. Second, the informativeness of each dimension for a given search display is computed as an entropy-related function of the target-similarity “landscape”. Third, as suggested by previous studies (Pomplun, 2006; Shen, Reingold & Pomplun, Percept. 2000), more informative dimensions are assumed to have a greater influence on attentional selection in visual search. The relative importance of each stimulus dimension and its dependence on informativeness is obtained from empirical eye-movement data. We tested the model by having it predict the distribution of saccadic endpoints in another experiment using real-world search displays.
The predicted distributions revealed a strong similarity to the empirically observed ones, indicating that the model identifies the most important factors contributing to top-down attentional control in visual search. This project was supported by Grant Number R15EY017988 from the National Eye Institute.
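The three principles named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the histogram-matching measure, the entropy-related function, or the empirically fitted weights. The sketch assumes histogram intersection as the similarity measure, one minus normalized Shannon entropy as informativeness, and uses informativeness directly as the dimension weight in place of weights fitted to eye-movement data. All function names and the feature layout (values scaled to [0, 1], one list per dimension per patch) are hypothetical.

```python
import math


def histogram(values, bins=8, lo=0.0, hi=1.0):
    """Normalized histogram of feature values assumed to lie in [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[max(0, min(idx, bins - 1))] += 1
    total = sum(counts)
    return [c / total for c in counts]


def similarity(h1, h2):
    """Histogram intersection (assumed measure): 1 = identical, 0 = disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def informativeness(landscape):
    """Assumed entropy-related function: 1 - normalized Shannon entropy.

    A flat similarity landscape (every patch equally target-like) scores
    near 0; a landscape peaked at a few patches scores near 1.
    """
    total = sum(landscape)
    if total == 0 or len(landscape) < 2:
        return 0.0
    probs = [s / total for s in landscape]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return max(0.0, 1.0 - entropy / math.log(len(landscape)))


def priority_map(target, patches, bins=8):
    """Combine per-dimension similarity landscapes, weighted by informativeness.

    target:  dict mapping dimension name -> list of feature values
    patches: list of dicts with the same keys, one per local image portion
    Returns (combined priority per patch, weight per dimension).
    """
    landscapes = {}
    for dim, target_values in target.items():
        target_hist = histogram(target_values, bins)
        landscapes[dim] = [
            similarity(target_hist, histogram(p[dim], bins)) for p in patches
        ]
    # Assumption: weight = informativeness. The study instead derives the
    # weights from empirical eye-movement data.
    weights = {dim: informativeness(l) for dim, l in landscapes.items()}
    weight_sum = sum(weights.values())
    if weight_sum == 0:
        weights = {dim: 1.0 for dim in landscapes}
        weight_sum = float(len(landscapes))
    combined = [
        sum(weights[d] * landscapes[d][i] for d in landscapes) / weight_sum
        for i in range(len(patches))
    ]
    return combined, weights
```

In this sketch, a dimension on which every patch resembles the target equally receives a weight near zero, so the combined map is driven by the dimensions that actually discriminate target-like regions from the rest of the display.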