May 2008
Volume 8, Issue 6
Vision Sciences Society Annual Meeting Abstract
A Bayesian model of visual search and recognition
Author Affiliations
  • Lior Elazary
    Computer Science, University of Southern California
  • Laurent Itti
    Computer Science, University of Southern California, and Neuroscience, University of Southern California
Journal of Vision May 2008, Vol.8, 841. doi:
      © ARVO (1962-2015); The Authors (2016-present)

Visual search and recognition in humans employ a combination of bottom-up (data-driven) and top-down (goal-driven) processes. Although many bottom-up search and recognition models have been developed, the computational and neural basis of top-down biasing in such models has remained elusive. This paper develops a new model of attention guidance with a dual emphasis: a single, common Bayesian representational framework is used (1) for learning how to bias and guide search towards desired targets and (2) for recognizing targets once they are found. At its core, the model learns, for each object, probability distributions over the values its visual appearance takes along a number of low-level visual feature dimensions, then uses this learned knowledge both to locate and to recognize desired objects. The model is tested on three publicly available datasets, ALOI, COIL and SOIL47, containing photographs of 1,000, 100 and 47 objects, respectively, taken under many viewpoints and illuminations (117,174 images in total). Recognition performance is compared to that of two state-of-the-art object recognition models (SIFT and HMAX). The proposed model is significantly more accurate and faster, reaching an 89% classification rate (SIFT: 25%, HMAX: 76%) when using 1/4 of the images for training and 3/4 for testing, while running 89 and 279 times faster than SIFT and HMAX, respectively. The proposed model can also be used for top-down guided search, finding a desired object in a 5x5 search array within 4 attempts on average (chance would be 12.5 attempts). Our results suggest that the simple Bayesian formalism developed here is capable of delivering robust machine vision performance.
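
To make the shared-representation idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the form of the learned distributions or the feature set, so this sketch assumes independent Gaussians per feature dimension (a naive Bayes simplification) and three made-up feature values per item. The function names `fit`, `log_score`, `recognize` and `search_order` are hypothetical. The same learned statistics are used twice: once to label a feature vector, and once to rank the locations of a search array by how well they match a desired target.

```python
import math
from collections import defaultdict

def fit(samples):
    """Learn per-object Gaussian statistics for each feature dimension.

    samples: list of (label, feature_vector) pairs.
    Assumption: features are independent given the object (naive Bayes).
    """
    by_label = defaultdict(list)
    for label, vec in samples:
        by_label[label].append(vec)
    model = {}
    for label, vecs in by_label.items():
        n, dims = len(vecs), len(vecs[0])
        means = [sum(v[d] for v in vecs) / n for d in range(dims)]
        # A small constant keeps variances positive with few samples.
        variances = [sum((v[d] - means[d]) ** 2 for v in vecs) / n + 1e-3
                     for d in range(dims)]
        model[label] = (means, variances, n / len(samples))
    return model

def log_score(model, label, vec):
    """Unnormalized log posterior: log prior + sum of log Gaussian likelihoods."""
    means, variances, prior = model[label]
    score = math.log(prior)
    for x, m, v in zip(vec, means, variances):
        score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return score

def recognize(model, vec):
    """Recognition: pick the object with the highest score."""
    return max(model, key=lambda label: log_score(model, label, vec))

def search_order(model, target, locations):
    """Top-down search: visit locations in decreasing order of target score."""
    return sorted(range(len(locations)),
                  key=lambda i: log_score(model, target, locations[i]),
                  reverse=True)

# Toy data: three hypothetical low-level feature values per image.
training = [
    ("mug", [0.9, 0.1, 0.2]), ("mug", [0.8, 0.2, 0.3]),
    ("ball", [0.2, 0.9, 0.7]), ("ball", [0.1, 0.8, 0.8]),
]
model = fit(training)
label = recognize(model, [0.85, 0.15, 0.25])
order = search_order(model, "mug", [[0.2, 0.9, 0.7],
                                    [0.85, 0.15, 0.25],
                                    [0.5, 0.5, 0.5]])
print(label, order)
```

In this toy setting, the number of attempts needed to find a target corresponds to the position of the true target location in the list returned by `search_order`, which is one plausible way to read the "attempts" measure reported for the 5x5 array.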

Elazary, L., & Itti, L. (2008). A Bayesian model of visual search and recognition [Abstract]. Journal of Vision, 8(6):841, 841a, doi:10.1167/8.6.841.
This work was supported by HFSP, NSF, DARPA, and NGA.
