Abstract
We must shift our attention to process the complex information in real-world scenes. How do we determine where to focus our attention in scenes? Image saliency theory proposes that our attention is ‘pulled’ to scene regions that differ in low-level image features (e.g., color, orientation, and/or luminance) from the surrounding regions. However, most image saliency models also produce substantial scene-independent spatial biases to help capture observer center bias. In the present study, we tested whether image saliency models explain scene attention based on scene-dependent image features or simply their scene-independent center bias. Participants (N = 65) viewed 40 real-world scenes for 12 seconds each while performing a scene memorization task. For each scene, a fixation density map was computed across all participant fixations to summarize scene attention. An image saliency map for each scene was then computed using three of the most cited image saliency models: the Itti & Koch model (Itti, Koch, & Niebur, 1998), the Graph-based Visual Saliency model (GBVS; Harel, Koch, & Perona, 2006), and the Attention based on Information Maximization model (AIM; Bruce & Tsotsos, 2007). For comparison, semantic feature maps (“meaning maps”) were generated using human ratings of the informativeness of isolated scene patches (Henderson & Hayes, 2017). The average squared correlation (R²) was computed separately between the scene fixation density maps and each image saliency model, and between the fixation density maps and each model's spatial bias. The image saliency models on average explained 52% less variance in scene fixation density than their spatial bias alone (IttiKoch bias=0.45, IttiKoch=0.19; GBVS bias=0.46, GBVS=0.37; AIM bias=0.41, AIM=0.08). In comparison, the meaning maps explained on average 14% more variance than the spatial bias models.
These results suggest that, during scene memorization, salient scene regions are weaker predictors of scene attention than a simple center bias model, whereas scene semantics explain additional variance beyond spatial center bias.
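The squared-correlation comparison can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' analysis code: the helper name squared_correlation is hypothetical, maps are assumed to be 2D arrays of equal size, and the relative reduction is assumed to be computed from the mean R² values reported above (the abstract does not state how the 52% figure was derived).

```python
import numpy as np

def squared_correlation(map_a, map_b):
    """Squared Pearson correlation between two equally sized maps.

    Hypothetical helper: flattens both maps and correlates them
    pixel by pixel.
    """
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    r = np.corrcoef(a, b)[0, 1]
    return r ** 2

# Mean R^2 values as reported in the abstract.
bias_r2 = {"IttiKoch": 0.45, "GBVS": 0.46, "AIM": 0.41}
model_r2 = {"IttiKoch": 0.19, "GBVS": 0.37, "AIM": 0.08}

mean_bias = np.mean(list(bias_r2.values()))
mean_model = np.mean(list(model_r2.values()))

# Assumed definition: proportional drop in explained variance of the
# full saliency models relative to their spatial bias alone.
relative_reduction = 1.0 - mean_model / mean_bias

print(f"mean bias R^2  = {mean_bias:.2f}")
print(f"mean model R^2 = {mean_model:.2f}")
print(f"relative reduction = {relative_reduction:.1%}")
```

Under this assumed definition, the reported per-model values give a reduction that rounds to the 52% stated above.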
Acknowledgement: Supported by NEI/NIH R01EY027792