Abstract
We present a Bayesian model for visual salience learning in natural scenes. The model evaluates each point in the scene as belonging to the salient or non-salient class, and the computed salience value of the point corresponds to the probability that it belongs to the salient class. From Bayesian probability theory, we derive three components of visual salience. When observers view the scene with no explicit goal, these salience modules are Rarity, Distinctiveness, and a Central bias. The Rarity module computes a global property, comparing a point to all points in the scene. The Distinctiveness module computes local contrast, comparing each point to all other points with a bias toward its local neighborhood. Finally, the Central bias serves as the location prior when no other task is given. A neural network trained with back-propagation learns a weighted, nonlinear combination of these modules to generate the final saliency map. We use this model to predict human eye fixations and to segment images. For fixation prediction, the saliency map is computed pixel by pixel from multi-scale feature representations. Experimental results on two fixation databases indicate that our method outperforms other representative methods. For image segmentation, the goal is to uniformly highlight the most salient foreground object and darken the background. Here, an image is first over-segmented into regions using the mean-shift method, and salience is then computed region by region from single-scale features. This yields a high-resolution saliency map that uniformly highlights the salient object and sharply delineates its borders. Experimental results on a salient-object segmentation database indicate that our method achieves better figure-ground segmentation than other representative methods.
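To make the three modules concrete, the following is a minimal Python/NumPy sketch of the pixel-wise pipeline on a single-channel feature map. It is an illustration, not the authors' implementation: the histogram-based rarity estimate, the Gaussian neighborhood weighting, every parameter value (bins, sigma_d, sigma_c, weights, bias), and the sigmoid stand-in for the trained back-propagation network are all assumptions.

```python
# Hedged sketch of the three salience modules; parameters are illustrative.
import numpy as np

def rarity(feat, bins=16):
    """Global rarity: -log probability of each point's feature value,
    estimated from a histogram over the whole image."""
    hist, edges = np.histogram(feat, bins=bins)
    p = hist / hist.sum()
    idx = np.clip(np.digitize(feat, edges[1:-1]), 0, bins - 1)
    r = -np.log(p[idx] + 1e-12)
    return (r - r.min()) / (np.ptp(r) + 1e-12)

def distinctiveness(feat, sigma_d=0.15):
    """Local contrast: each point's feature difference to all other points,
    Gaussian-weighted in image coordinates so the neighborhood dominates."""
    h, w = feat.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.column_stack([ys.ravel() / h, xs.ravel() / w])
    f = feat.ravel()
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)  # O(N^2): sketch only
    wgt = np.exp(-d2 / (2 * sigma_d ** 2))
    c = (wgt * np.abs(f[None, :] - f[:, None])).sum(1) / wgt.sum(1)
    c = c.reshape(h, w)
    return (c - c.min()) / (np.ptp(c) + 1e-12)

def center_bias(shape, sigma_c=0.3):
    """Location prior: a Gaussian centered on the image."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys / h - 0.5) ** 2 + (xs / w - 0.5) ** 2
    return np.exp(-d2 / (2 * sigma_c ** 2))

def saliency(feat, weights=(1.0, 1.0, 1.0), bias=-1.5):
    """Sigmoid of a weighted sum of the modules -- an assumed stand-in for
    the trained back-propagation network; output is read as P(salient)."""
    mods = np.stack([rarity(feat), distinctiveness(feat), center_bias(feat.shape)])
    z = np.tensordot(np.asarray(weights), mods, axes=1) + bias
    return 1.0 / (1.0 + np.exp(-z))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((24, 32))    # toy single-channel "feature map"
    img[8:16, 12:22] += 1.0       # a rare, high-contrast patch
    print(saliency(img).round(2))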
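The segmentation branch can be sketched in the same spirit, assuming sklearn.cluster.MeanShift as the over-segmenter and per-region mean color as the single-scale feature; the bandwidth, the sigmas, and the equal module weights are again illustrative, and the learned combination is replaced by a plain sum.

```python
# Hedged sketch of the region-wise variant; not the authors' implementation.
import numpy as np
from sklearn.cluster import MeanShift

def region_saliency(img, bandwidth=0.3, sigma_d=0.3, sigma_c=0.3):
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # over-segment by clustering on color + normalized position
    X = np.column_stack([img.reshape(-1, 3),
                         ys.ravel()[:, None] / h,
                         xs.ravel()[:, None] / w])
    labels = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit(X).labels_
    n = labels.max() + 1
    feat = np.array([X[labels == k, :3].mean(0) for k in range(n)])  # mean color
    cen  = np.array([X[labels == k, 3:].mean(0) for k in range(n)])  # centroid
    size = np.bincount(labels, minlength=n) / labels.size
    # Rarity: regions whose color is uncommon (size-weighted) score high
    col_d = np.linalg.norm(feat[:, None] - feat[None, :], axis=-1)
    rar = (col_d * size[None, :]).sum(1)
    # Distinctiveness: color contrast to other regions, biased to nearby ones
    sp_d2 = ((cen[:, None] - cen[None, :]) ** 2).sum(-1)
    wgt = np.exp(-sp_d2 / (2 * sigma_d ** 2))
    dis = (wgt * col_d).sum(1) / wgt.sum(1)
    # Central bias evaluated at each region's centroid
    ctr = np.exp(-((cen - 0.5) ** 2).sum(1) / (2 * sigma_c ** 2))
    s = rar / (rar.max() + 1e-12) + dis / (dis.max() + 1e-12) + ctr  # equal weights (assumed)
    return (s / s.max())[labels].reshape(h, w)  # paint scores back per region

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 48, 3)) * 0.2
    img[10:22, 18:36] += 0.7      # a distinct foreground block
    smap = region_saliency(np.clip(img, 0, 1))
    print(smap.shape, float(smap.max()))
```

Because every pixel in a region receives the same score, the resulting map stays at full image resolution and region borders fall exactly on the segment boundaries, which is the property the abstract highlights for figure-ground segmentation.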
Meeting abstract presented at VSS 2013