The bottom-up saliency assumption is based on the hypothesis that certain features of the visual scene inherently attract gaze; that is, that vision is essentially reactive and stimulus driven. This view rests in part on psychophysical studies (Krieger, Rentschler, Hauske, Schill, & Zetzsche, 2000; Mannan, Ruddock, & Wooding, 1996; Reinagel & Zador, 1999; Tatler, Baddeley, & Gilchrist, 2005) in which differences in image properties were observed between fixated and randomly chosen locations. In these studies, the initial fixations made while viewing a large number of two-dimensional photographic images are recorded, and the statistics of several features at the gaze locations chosen by the subjects are compared with those at randomly chosen parts of the scene. Such studies have produced mixed results regarding which features differ significantly at fixation locations. Whereas some researchers found that luminance contrast was elevated at the point of gaze (Parkhurst & Niebur, 2003), other studies found that edge density was significantly higher at fixation locations (Baddeley & Tatler, 2006; Mannan et al., 1996). Subsequently, models have been proposed that relate the bottom-up assumption to neuronal processing in cortical visual areas. Starting from the notion that "early" visual areas represent low-level features such as oriented edges, such models extract analogous features from images and propose methods by which a scalar saliency map can be calculated (Itti, Koch, & Niebur, 1998; Koch & Ullman, 1985). These methods apply different forms of center-surround competitive algorithms to find regions of across-scale contrast within single feature dimensions, and then combine the resulting maps into a single saliency map (Itti, 2000) using some weighting scheme. Such feature-based saliency models have a large number of free parameters, which must be adjusted in order to obtain meaningful saliency maps. One must choose the number of filters, their respective parameters such as orientations and spatial frequencies, as well as the spatial scales, the normalization functions, the summation rules, and the parameters of the network implementing the spatial competition within the saliency map.
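To make the two ideas above concrete, the following is a minimal sketch, not the implementation of any of the cited models. It assumes that center-surround contrast can be approximated by the absolute difference of a fine and a coarse Gaussian blur, that maps are rescaled to [0, 1] before summation, and that features are combined with fixed weights. All function names, the scale pairs, the weights, the synthetic image, and the fixation coordinates are hypothetical choices made for illustration. The sketch also shows how many parameters such a model exposes even in this reduced form.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge padding; output has the input shape."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def center_surround(feature, center_sigma, surround_sigma):
    """Assumed stand-in for across-scale contrast: |fine blur - coarse blur|."""
    return np.abs(blur(feature, center_sigma) - blur(feature, surround_sigma))

def normalize(m):
    """Rescale a map to [0, 1]; a flat map becomes all zeros."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def saliency_map(features, scales=((1.0, 4.0), (2.0, 8.0)), weights=None):
    """Weighted sum of normalized center-surround maps over features and scales."""
    if weights is None:
        weights = {name: 1.0 for name in features}  # assumed: equal weighting
    total = np.zeros_like(next(iter(features.values())), dtype=float)
    for name, fmap in features.items():
        per_feature = sum(normalize(center_surround(fmap, c, s)) for c, s in scales)
        total += weights[name] * normalize(per_feature)
    return normalize(total)

def fixated_vs_random(feature, fixations, n_random=1000, seed=0):
    """Mean feature value at fixated pixels and at uniformly random pixels."""
    rng = np.random.default_rng(seed)
    rows, cols = np.array(fixations).T
    rand_rows = rng.integers(0, feature.shape[0], n_random)
    rand_cols = rng.integers(0, feature.shape[1], n_random)
    return feature[rows, cols].mean(), feature[rand_rows, rand_cols].mean()

# Synthetic stimulus: a bright patch on a dark background.
image = np.zeros((64, 64))
image[28:36, 40:48] = 1.0
gy, gx = np.gradient(image)
features = {"intensity": image, "edges": np.hypot(gx, gy)}

sal = saliency_map(features)
peak = np.unravel_index(np.argmax(sal), sal.shape)

# Hypothetical fixations (row, col) placed on the patch, for illustration only.
fixations = [(30, 42), (32, 44), (29, 46)]
at_fixation, at_random = fixated_vs_random(sal, fixations)
```

Even this reduced version requires choosing the feature set, the two scale pairs, the normalization function, and the summation weights, and changing any of them changes the resulting map. The comparison in `fixated_vs_random` only contrasts two means; an actual analysis would use recorded fixations and an appropriate statistical test.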