October 2003
Volume 3, Issue 9
Vision Sciences Society Annual Meeting Abstract | October 2003
Learning to optimally detect image boundaries using brightness, color and texture
Author Affiliations
  • David R Martin
    Computer Science Division, UC Berkeley, USA
Journal of Vision October 2003, Vol.3, 113. doi:10.1167/3.9.113

      David R Martin, Charless C Fowlkes, Jitendra Malik; Learning to optimally detect image boundaries using brightness, color and texture. Journal of Vision 2003;3(9):113. doi: 10.1167/3.9.113.


      © ARVO (1962-2015); The Authors (2016-present)


Goal: Psychophysics, e.g. Rivest and Cavanagh (1996), has shown that humans make combined use of multiple cues to detect and localize boundaries in images. We use a dataset of natural images to learn the optimal combination of local brightness, texture and color cues, and to quantify the relative power of these cues.

Methods: Cue combination is formulated as a supervised learning problem. A large dataset (∼1000) of natural images, each segmented by multiple human observers (∼10), provides the ground-truth label for each pixel: whether or not it contains an oriented boundary element. The task is to model the posterior probability that a pixel lies on a boundary at a particular orientation, conditioned on local features derived from brightness, texture and color. Our features are based on directional gradients of the outputs of V1-like mechanisms. Texture gradients are computed as differences in histograms of oriented filter outputs, and color gradients from histograms of a*, b* features in CIE L*a*b* space. Several types of classifiers, ranging from logistic regression to support vector machines, were trained. Performance was evaluated on a separate test set using a precision-recall curve, which is a variant of the ROC curve. This curve can be summarized by its optimal F-measure, the harmonic mean of precision and recall.

Results: (1) The precise form of the classifier does not matter: logistic regression (a weighted linear combination of features) gave results as good as those of more complicated classifiers. (2) Singly, brightness, texture and color yield F-measures of 0.62, 0.61, and 0.60, respectively. The optimal gray-scale combination of brightness and texture has an F-measure of 0.65, and the addition of color boosts it to 0.67. These results indicate that the different cues are correlated but do carry independent information. Inter-human consistency gives a gold-standard F-measure of 0.8, which quantifies the gap left for more global and high-level processing.
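The summary statistic used in the evaluation, the optimal F-measure along a precision-recall curve, can be illustrated with a minimal Python sketch. The function names, the toy scores and labels, and the threshold values are all hypothetical, and the direct pixel-wise comparison between detector output and human labels is a simplifying assumption, not a description of the evaluation procedure actually used in this work.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def precision_recall_curve(scores, labels, thresholds):
    """Return (threshold, precision, recall) triples for a boundary detector.

    scores: per-pixel boundary probabilities in [0, 1] (hypothetical values)
    labels: 1 where a human marked a boundary, 0 elsewhere
    """
    curve = []
    for t in thresholds:
        # A pixel is declared a boundary when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        curve.append((t, precision, recall))
    return curve


def optimal_f_measure(curve):
    """Best F-measure over all points on the curve, with its threshold."""
    return max((f_measure(p, r), t) for t, p, r in curve)


# Toy data, invented for illustration only (not taken from the abstract).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
thresholds = [0.2, 0.5, 0.75]

curve = precision_recall_curve(scores, labels, thresholds)
best_f, best_t = optimal_f_measure(curve)
print(f"optimal F-measure {best_f:.2f} at threshold {best_t}")
```

Sweeping the threshold trades recall for precision; reporting the single best F-measure along the curve makes it possible to compare detectors, or individual cues, with one number.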

Martin, D. R., Fowlkes, C. C., Malik, J. (2003). Learning to optimally detect image boundaries using brightness, color and texture [Abstract]. Journal of Vision, 3(9): 113, 113a, http://journalofvision.org/3/9/113/, doi:10.1167/3.9.113.
