Abstract
Saliency algorithms predominantly use a winner-take-all (WTA) approach to determine fixation targets. However, WTA methods are vulnerable to spike noise, so most algorithms smooth the initial saliency map to improve the signal-to-noise ratio (SNR). It is common practice to plot the performance of a given saliency algorithm against the size of this smoothing kernel in order to find the overall optimal degree of smoothing (see Hou et al. 2012 for an example); we show that the optimal degree of smoothing nevertheless varies between individual images. We propose an algorithm that projects a saliency map across multiple spatial scales and dynamically determines the most appropriate scale from which to select a fixation target. This not only provides a mechanism for improving the performance of most saliency algorithms, but also links saliency research to a number of important psychophysical results in visual search. As Wolfe (1998) has described, a large number of dimensions underlie efficient visual search beyond those ordinarily covered by saliency algorithms (which typically focus on orientation and colour); our algorithm explicitly allows spatial scale to be incorporated into the salience calculation. Likewise, Najemnik and Geisler (2005) demonstrate that an optimal observer in a visual search task executes fixations based both on WTA strategies and on what they label Center-of-Gravity (CoG) strategies. By identifying spatial scales at which a number of localized saliency peaks are smoothed together into a single, more distinct central peak, we provide an implicit mechanism for generating CoG-style fixations.
Meeting abstract presented at VSS 2014
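
The multi-scale selection described in the abstract could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state how the most appropriate scale is chosen, so the criterion used here (the ratio of the highest peak to the strongest rival outside a small exclusion window) is an assumption, as are all function names, the Gaussian kernel, the candidate scales, and the toy map.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights, truncated at three standard deviations."""
    radius = math.ceil(3 * sigma)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(grid, sigma):
    """Separable Gaussian blur with zero padding; sigma <= 0 returns a copy."""
    if sigma <= 0:
        return [row[:] for row in grid]
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2

    def blur_line(line):
        out = []
        for i in range(len(line)):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - radius
                if 0 <= j < len(line):  # samples outside the map count as zero
                    acc += w * line[j]
            out.append(acc)
        return out

    rows, cols = len(grid), len(grid[0])
    horizontal = [blur_line(row) for row in grid]
    columns = [blur_line([horizontal[r][c] for r in range(rows)])
               for c in range(cols)]
    return [[columns[c][r] for c in range(cols)] for r in range(rows)]


def argmax(grid):
    """Winner-take-all: return ((row, col), value) of the largest entry."""
    best_pos, best_val = (0, 0), -math.inf
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (r, c), value
    return best_pos, best_val


def peak_uniqueness(grid, exclusion):
    """ASSUMED criterion: top peak divided by the best value outside its window."""
    (pr, pc), peak = argmax(grid)
    rival = max((value for r, row in enumerate(grid)
                 for c, value in enumerate(row)
                 if max(abs(r - pr), abs(c - pc)) > exclusion), default=0.0)
    return peak / (rival + 1e-12)


def select_fixation(saliency, sigmas=(0.0, 1.0, 2.0, 4.0)):
    """Smooth at each candidate scale, keep the scale with the most unique peak."""
    best = None
    for sigma in sigmas:
        smoothed = smooth(saliency, sigma)
        exclusion = max(2, math.ceil(2 * sigma))
        score = peak_uniqueness(smoothed, exclusion)
        if best is None or score > best[0]:
            best = (score, argmax(smoothed)[0], sigma)
    return best[1], best[2]  # (fixation location, chosen sigma)


def make_demo_map(size=21):
    """Toy map: one isolated noise spike plus a cluster of four weaker peaks."""
    grid = [[0.0] * size for _ in range(size)]
    grid[2][2] = 1.0  # spike noise
    for r, c in ((13, 13), (13, 15), (15, 13), (15, 15)):
        grid[r][c] = 0.6
    return grid


if __name__ == "__main__":
    demo = make_demo_map()
    print("raw WTA fixation:", argmax(demo)[0])
    print("multi-scale fixation and sigma:", select_fixation(demo))
```

On the toy map, plain WTA on the unsmoothed map lands on the isolated spike, whereas the multi-scale selection favours a coarser scale at which the four clustered peaks merge, so the fixation falls at the centre of the cluster, in the spirit of a CoG-style fixation. A different uniqueness criterion or set of scales may well behave differently.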