December 2001
Volume 1, Issue 3
Vision Sciences Society Annual Meeting Abstract
Unsupervised learning of object classes from natural scenes
Author Affiliations
  • Pietro Perona
    California Institute of Technology, Pasadena, CA, USA
  • Markus Weber
    California Institute of Technology, Pasadena, CA, USA
  • Max Welling
    University College London, Gatsby Unit, UK
Journal of Vision December 2001, Vol. 1, 419. doi:10.1167/1.3.419
      © ARVO (1962-2015); The Authors (2016-present)

Recognizing objects in images is one of the most important functions of our visual system. We can recognize not only individual objects, such as the Eiffel Tower or our grandmother's face, but also categories of objects, such as people, shoes, and automobiles. Considerable attention has been devoted to formulating models and algorithms that may explain visual recognition and classification; however, no theory yet describes how these models may be trained automatically in realistic conditions. Can a child, or a machine, learn to recognize ‘faces’ and ‘cars’ only by looking? This is at best a difficult task: natural scenes are cluttered and may not contain explicit information on the presence, location and structure of new objects, even if such objects are plentiful. We present a computational theory of how object models may be learned from images of such scenes. We model object categories probabilistically, as collections of parts that appear in a characteristic spatial arrangement. We demonstrate that it is possible to successfully train such models on unsegmented, cluttered images without supervision, and that object categories are an emergent property of the learning process. Our method is based on maximizing the likelihood of the model with respect to the training data in two steps: first, features that appear often in the environment are selected as probable object parts; second, constellations of such parts that tend to appear in a consistent mutual position are selected. The probabilistic description of such constellations is the model for an object class.
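
The two-step procedure described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. It assumes that features have already been detected and labeled with a type, that only pairs of parts are considered, and that the offset between two parts follows a Gaussian with independent x and y components. The function names, the thresholds `min_fraction` and `max_variance`, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical toy input: each image is a list of (feature_type, x, y) detections.

def select_parts(images, min_fraction=0.8):
    """Step 1: keep feature types that occur in at least min_fraction of images."""
    counts = Counter()
    for detections in images:
        counts.update({ftype for ftype, _, _ in detections})
    return sorted(t for t, c in counts.items() if c / len(images) >= min_fraction)

def fit_pair(images, a, b):
    """Fit a Gaussian to the offset (b - a) over images containing both parts.

    The sample mean and population variance are the maximum-likelihood
    estimates for a Gaussian; x and y are treated as independent (assumption).
    """
    dxs, dys = [], []
    for detections in images:
        pos = {}
        for ftype, x, y in detections:
            pos.setdefault(ftype, (x, y))  # simplification: first detection only
        if a in pos and b in pos:
            dxs.append(pos[b][0] - pos[a][0])
            dys.append(pos[b][1] - pos[a][1])
    if len(dxs) < 2:
        return None  # too few co-occurrences to estimate a variance
    return (mean(dxs), mean(dys)), (pvariance(dxs), pvariance(dys))

def select_constellations(images, parts, max_variance=4.0):
    """Step 2: keep part pairs whose mutual position is consistent across images."""
    model = {}
    for a, b in combinations(parts, 2):
        fit = fit_pair(images, a, b)
        if fit is not None and max(fit[1]) <= max_variance:
            model[(a, b)] = fit
    return model

# Four cluttered toy "scenes": eye and mouth keep a stable offset, leaf does not.
images = [
    [("eye", 10, 10), ("mouth", 10, 30), ("leaf", 50, 5)],
    [("eye", 40, 22), ("mouth", 41, 42), ("leaf", 3, 60)],
    [("eye", 25, 5), ("mouth", 25, 26), ("brick", 70, 70)],
    [("eye", 60, 50), ("mouth", 59, 70), ("leaf", 20, 90)],
]
parts = select_parts(images, min_fraction=0.75)
model = select_constellations(images, parts)
```

In this toy data, "leaf" is frequent enough to pass the first step but is rejected in the second because its position relative to the other parts varies widely, so only the eye and mouth pair survives as a constellation. A fuller treatment would handle multiple detections per type, missing parts, and more than two parts jointly.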

Perona, P., Weber, M., Welling, M. (2001). Unsupervised learning of object classes from natural scenes [Abstract]. Journal of Vision, 1(3): 419, 419a, doi:10.1167/1.3.419.
