Abstract
Statistical approaches have become a main framework for understanding natural vision over the last two decades. A major unmet challenge for this framework is to develop a statistical model of natural visual scenes; a key element of such a model is the higher-order structures in natural scenes and their statistics. In this work, we developed a probabilistic model of natural scenes in which each scene is a sample from a probability distribution defined over a set of natural scene structures (NSSs) and their spatial arrangements. In this model, NSSs are multi-size, multi-scale spatial concatenations of local visual features. To compile NSSs from images of natural scenes, we first sampled a large number of multi-size, multi-scale circular patches arranged in a hexagonal configuration. We then performed independent component analysis on the patches and grouped the independent components into clusters using the K-means method. Finally, for each scene category, we obtained a set of NSSs by clustering the scene patches. To model spatial concatenations of NSSs in natural scenes, we compiled the adjacency matrix for each NSS and obtained its eigenvalues. We examined the statistics and the information content of NSSs. After selecting a set of informative NSSs for each scene category, we used the occurrence frequencies and the spatial information of NSSs as features to classify natural scenes in two widely used datasets. We obtained three results. First, NSSs carry varying amounts of information about natural scenes and include smooth luminance patterns, textures, and patterns with edges and junctions. Second, scene classification using NSSs as features can achieve high accuracy. Third, spatial concatenations of NSSs can encode spatial information in natural scenes. We thus conclude that this model of natural scenes is a plausible candidate model for human scene perception.
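The feature construction described above (NSS occurrence frequencies plus eigenvalues of an adjacency matrix) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: patches are taken to be already assigned to NSS clusters, so each scene is a 2-D array of integer NSS labels; a square grid with 4-neighbor adjacency stands in for the hexagonal configuration; and a single symmetric co-occurrence matrix over all NSS labels stands in for the per-NSS adjacency matrices. The function names `nss_frequencies`, `nss_adjacency`, and `scene_features` are hypothetical.

```python
import numpy as np


def nss_frequencies(labels, n_nss):
    """Normalized histogram of NSS labels in one scene."""
    counts = np.bincount(labels.ravel(), minlength=n_nss)
    return counts / counts.sum()


def nss_adjacency(labels, n_nss):
    """Symmetric matrix counting how often NSS i and NSS j are grid neighbors.

    Assumption: square grid, 4-neighborhood (the abstract uses a hexagonal
    patch layout, which is simplified away here).
    """
    adj = np.zeros((n_nss, n_nss))
    neighbor_pairs = [
        (labels[:, :-1], labels[:, 1:]),  # horizontal neighbors
        (labels[:-1, :], labels[1:, :]),  # vertical neighbors
    ]
    for a, b in neighbor_pairs:
        # Count each pair in both directions so the matrix stays symmetric.
        np.add.at(adj, (a.ravel(), b.ravel()), 1)
        np.add.at(adj, (b.ravel(), a.ravel()), 1)
    return adj


def scene_features(labels, n_nss):
    """Concatenate occurrence frequencies with adjacency eigenvalues."""
    freq = nss_frequencies(labels, n_nss)
    # eigvalsh handles symmetric matrices and returns real eigenvalues
    # in ascending order.
    eig = np.linalg.eigvalsh(nss_adjacency(labels, n_nss))
    return np.concatenate([freq, eig])


# Toy scene: a 3x3 grid of patches, each labeled with one of 3 NSSs.
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])
features = scene_features(labels, n_nss=3)
```

A feature vector of this kind could then be passed to any standard classifier. The abstract does not say which classifier was used, so that step is left out.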
Meeting abstract presented at VSS 2013