Abstract
People have a remarkable ability to attend selectively to one or a few sensory inputs while ignoring the others (Moran & Desimone, 1985). Although many studies have examined selective attention at the object level (Brosch, Pourtois, & Sander, 2010), less is known about human attention and visual perception in complex natural scenes. We studied the correlation between human attention and visual perception of images of natural scenes. Sixteen subjects (mean age = 27) freely viewed 1249 emotional images while their eye movements were recorded. A separate group of 358 participants viewed the same set of images and annotated a comprehensive list of 33 scene-level attributes, comprising 10 emotions (happiness, surprise, awe, excitement, amusement, contentment, sadness, anger, fear, and disgust) and 23 other attributes commonly studied in the computer science community (e.g., aesthetics, image quality). Analyses indicated the following relationships between observer fixation patterns and image attributes: (1) Observers generally show longer fixation durations and saccade durations on images with positive sentiments (e.g., aesthetics, awe), but shorter fixations on images with negative sentiments (e.g., sadness, disgust). (2) Observers generally show shorter saccade lengths and lower saccade velocities on images that are more centered and symmetric. (3) Images that are of high quality or contain a focused object usually elicit shorter saccade durations and higher saccade velocities. In summary, we found that image semantics, sentiment, and spatial layout correlate significantly with human fixation patterns. Our method is general and comprehensive in that it focuses on complex natural scenes and examines an extensive list of attributes.
Meeting abstract presented at VSS 2017
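
The kind of analysis described in the abstract, relating a per-image eye-movement measure to per-image attribute ratings, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation measure was used, so Pearson's r is assumed here, and the function names (`pearson_r`, `correlate_attributes`) and all numeric values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate_attributes(eye_metric, attributes):
    """Correlate one per-image eye-movement metric with each attribute's ratings."""
    return {name: pearson_r(eye_metric, ratings)
            for name, ratings in attributes.items()}


# Made-up illustrative values for five images (assumption, not study data).
mean_fixation_ms = [210.0, 250.0, 230.0, 300.0, 280.0]
ratings = {
    "awe": [1.0, 2.0, 1.5, 4.0, 3.5],
    "disgust": [4.0, 3.0, 3.5, 1.0, 1.5],
}
result = correlate_attributes(mean_fixation_ms, ratings)
```

With these made-up values, the coefficient for "awe" is positive and the one for "disgust" is negative, mirroring the direction of finding (1). A real analysis would also need a significance test and a correction for comparing 33 attributes, neither of which is shown here.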