September 2011
Volume 11, Issue 11
Vision Sciences Society Annual Meeting Abstract
Can configural relations be encoded by image histograms of higher-order filters?
Author Affiliations
  • Nicholas M. Van Horn
    The Ohio State University, USA
  • Alexander A. Petrov
    The Ohio State University, USA
  • James T. Todd
    The Ohio State University, USA
Journal of Vision September 2011, Vol. 11, 852.

Nicholas M. Van Horn, Alexander A. Petrov, James T. Todd; Can configural relations be encoded by image histograms of higher-order filters? Journal of Vision 2011;11(11):852.

      © ARVO (1962-2015); The Authors (2016-present)

A popular method for representing images is to compute histograms of pixel intensities, wavelet responses, or the outputs of more complex filters that are tuned to specific shapes (e.g. Riesenhuber & Poggio, 1999). Because these representations do not retain the locations of the activated filters, they may not be well suited for the analysis of configural relations among image features. The present study was designed to address this issue.
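The key limitation can be illustrated with a minimal sketch (a hypothetical toy example, not the study's actual stimuli): a pixel-intensity histogram discards the locations of activated pixels, so three dots arranged collinearly and the same three dots in a non-collinear configuration produce identical histograms.

```python
import numpy as np

# Two 7x7 binary "images" containing the same three dots,
# arranged collinearly in one and non-collinearly in the other.
# (Toy illustration; the study used irregular dots on a black disc.)
collinear = np.zeros((7, 7), dtype=int)
collinear[3, [1, 3, 5]] = 1                      # three dots on one row

non_collinear = np.zeros((7, 7), dtype=int)
non_collinear[1, 1] = 1
non_collinear[3, 3] = 1
non_collinear[5, 1] = 1                          # same dots, bent arrangement

# Intensity histograms: counts of 0-valued and 1-valued pixels.
h1 = np.bincount(collinear.ravel(), minlength=2)
h2 = np.bincount(non_collinear.ravel(), minlength=2)

# The histograms are identical (46 background pixels, 3 dot pixels),
# so this representation cannot distinguish the two configurations.
print(np.array_equal(h1, h2))                    # → True
```

The same invariance holds for histograms of wavelet or shape-tuned filter responses whenever the filters respond equally to each dot regardless of its position; only the tally of activations survives, not their spatial arrangement.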

Method: We created 4 classes of objects based on the classic Vernier acuity and bisection tasks. Each object was composed of 3 irregularly shaped white dots embedded in a larger irregular black disc. Class membership was determined by whether the dots were arranged collinearly and whether they were equally spaced. Naive observers were trained to classify the stimuli in two separate conditions: One in which they were trained and tested with all possible stimulus orientations, and a second in which they were trained with one set of orientations and then tested with another. We also evaluated these same two conditions using a recent implementation by Mutch & Lowe (2006) of the HMAX model originally developed by Riesenhuber & Poggio (1999).

Results: Most subjects exceeded 85% accuracy in both conditions after ∼180 exposures to each class. Performance was much worse for the HMAX model. When trained with 300 images from each class, its average classification accuracy was only 31%, and was reduced to chance (i.e. 26%) in the transfer condition.

Conclusion: Human observers can easily learn to classify objects based on configural properties such as collinear alignment or bisection, but similar performance cannot be achieved using histograms of higher-order features as implemented in the HMAX model. These findings identify an important limitation of representing images with histograms of filter activations without retaining the relative spatial locations of those filters.

Supported by NSF BCS-0962119. 
