Article  |   December 2013
Computing local edge probability in natural scenes from a population of oriented simple cells
Journal of Vision December 2013, Vol.13, 19. doi:10.1167/13.14.19
Chaithanya A. Ramachandra, Bartlett W. Mel; Computing local edge probability in natural scenes from a population of oriented simple cells. Journal of Vision 2013;13(14):19. doi: 10.1167/13.14.19.
Abstract

A key computation in visual cortex is the extraction of object contours, where the first stage of processing is commonly attributed to V1 simple cells. The standard model of a simple cell—an oriented linear filter followed by a divisive normalization—fits a wide variety of physiological data, but is a poor-performing local edge detector when applied to natural images. The brain's ability to finely discriminate edges from nonedges therefore likely depends on information encoded by local simple cell populations. To gain insight into the corresponding decoding problem, we used Bayes's rule to calculate edge probability at a given location/orientation in an image based on a surrounding filter population. Beginning with a set of ∼100 filters, we culled out a subset that was maximally informative about edges, and minimally correlated so as to allow factorization of the joint on- and off-edge likelihood functions. Key features of our approach include a new, efficient method for ground-truth edge labeling, an emphasis on achieving filter independence, including a focus on filters in the region orthogonal rather than tangential to an edge, and the use of a customized parametric model to represent the individual filter likelihood functions. The resulting population-based edge detector has zero parameters, calculates edge probability based on a sum of surrounding filter influences, is much more sharply tuned than the underlying linear filters, and effectively captures fine-scale edge structure in natural scenes. Our findings predict nonmonotonic interactions between cells in visual cortex, wherein a cell may for certain stimuli excite and for other stimuli inhibit the same neighboring cell, depending on the two cells' relative offsets in position and orientation, and their relative activation levels.

Introduction
Detecting object contours is a critical function of biological visual systems, and a key step on the path to object recognition (Biederman, 1987; Biederman & Ju, 1988; DeCarlo, 2008; Kourtzi & Kanwisher, 2001; Lowe, 1999; Marr, 1983; see Papari & Petkov, 2011, for review of computer vision approaches to edge/contour detection). Our understanding of the contour detection problem is far from complete, however, and algorithms capable of reliably detecting object contours in natural scenes remain elusive. In the visual cortex, the first stage of contour processing is thought to occur in area V1, where “simple cells” (the first stage in Hubel & Wiesel's 1962 classical hierarchical model) are tuned to the position and orientation of local edges. Though the biophysical mechanisms underlying simple cell responses are complex (Priebe & Ferster, 2012), a simple cell's output can be modeled under a wide range of experimental conditions as an oriented linear filter followed by a divisive normalization operation (Carandini & Heeger, 2012). Cells modeled in this way exhibit poor position and orientation tuning compared to real simple cells with the same aspect ratio, however (Gardner, Anzai, Ohzawa, & Freeman, 1999), and suffer from selectivity problems when applied as edge detectors to natural images, suggesting that simple cells in V1 benefit from additional nonlinear processing that improves their ability to localize edges, and to discriminate edges from nonedges across wide-ranging contrast levels. Raising thresholds or applying an expansive output nonlinearity (Heeger, 1992) can sharpen tuning curves to an arbitrary degree, but is an ineffective strategy from an edge-detection perspective in that the underlying linear filtering operation is fundamentally unable to distinguish properly aligned low contrast edges from misaligned high contrast ones, or from contrasty nonedge structures, and this intrinsic weakness of the detector cannot be remedied by thresholding. 
More sophisticated edge/contour detection algorithms that retain some biological inspiration have typically exploited the Gestalt principle of “good continuation” or related principles to improve detection performance (Choe & Miikkulainen, 1998; Elder & Zucker, 1998; Grossberg & Williamson, 2001; Guy & Medioni, 1992; Z. Li, 1998; Parent & Zucker, 1989; Ross, Grossberg, & Mingolla, 2000; Sha'asua & Ullman, 1988; VanRullen, Delorme, & Thorpe, 2001; Williams & Jacobs, 1997; Yen & Finkel, 1998). That our visual systems are highly sensitive to continuous contours is supported by numerous psychophysical (Adini, Sagi, & Tsodyks, 1997; Dresp, 1993; Field, Hayes, & Hess, 1993; Geisler, Perry, Super, & Gallogly, 2001; Kapadia, Ito, Gilbert, & Westheimer, 1995; W. Li & Gilbert, 2002; Polat & Sagi, 1993) and neurophysiological (Bauer & Heinze, 2002; Kapadia et al., 1995; Kapadia, Westheimer, & Gilbert, 2000; Kourtzi & Huberle, 2005; Kourtzi, Tolias, Altmann, Augath, & Logothetis, 2003; Polat, Mizobe, Pettet, Kasamatsu, & Norcia, 1998) studies on humans and monkeys. Furthermore, that our visual systems should be sensitive to contour continuity is supported by edge co-occurrence statistics in natural scenes (Geisler et al., 2001; Sigman, Cecchi, Gilbert, & Magnasco, 2001). Virtually all of these previous studies concur that the key measurements needed for contour extraction lie in a butterfly-shaped “association field” centered on a reference edge that reflects contour continuity principles (Field et al., 1993), with an inhibitory region orthogonal to the edge (Figure 1; Geisler et al., 2001; Kapadia et al., 2000; Z. Li, 1998) that presumably reflects the tendency for only a single object contour at a time to pass through any given point in the image. 
Figure 1
 
Tangential versus orthogonal regions surrounding a candidate edge. Oriented filters in the vicinity of a reference location (marked by a red rectangle) can be loosely classified into two groups—those in the orthogonal region (in blue) and those in the tangential region (in green). Tangential edges are particularly subject to higher order correlations. For example, given an edge at the reference location, evidence for edges at Locations A and C is positive evidence for edges at B and D, but negative evidence for edges at E and F.
Identifying the set of image measurements that is most useful for contour extraction is no doubt a crucial step, but leaves open the question as to how those measurements should be algorithmically combined to detect contours in natural images. First-principles theoretical models of edge/contour structure can be used to develop edge detection algorithms (see Papari & Petkov, 2011), but face a multitude of modeling challenges, including the multiscale structure of natural object boundaries, lighting inhomogeneities, partial occlusions, disappearing local contrast, and optical effects such as blur from limited depth of field. As an alternative, the effects of all of these real-world complexities, and others known and unknown, can in principle be “learned” from positive and negative examples of natural edges, but this opens up a vast space of engineering choices, including: what type of classifier should be adopted, what filter values should be used as features, and what training data should be collected (Dollar, Tu, & Belongie, 2006; Konishi, Yuille, Coughlan, & Zhu, 2003; D. R. Martin, Fowlkes, & Malik, 2004). 
Fundamentally, the way that a population of filter responses r1, r2, …rN should be combined to calculate the probability that an edge exists at a reference location and orientation follows from Bayes's rule. Bayesian inference has had significant successes in explaining behavior in sensory and motor tasks (Fiser, Berkes, Orbán, & Lengyel, 2010; Kording & Wolpert, 2004; Tenenbaum, Kemp, Griffiths, & Goodman, 2011; Weiss, Simoncelli, & Adelson, 2002; A. Yuille & Kersten, 2006; A. L. Yuille & Grzywacz, 1988). However, in the context of edge detection within a V1-like architecture, given that there are thousands of oriented filters within a small distance of a candidate edge, the need for human labeled ground truth data makes it intractable to fully populate the high-dimensional joint filter likelihood functions required to evaluate Bayes's rule. 
Konishi et al. (2003) dealt with this curse of dimensionality by limiting their analysis to small sets of off-the-shelf edge filters centered on a candidate edge (up to six filters at a time), and used an adaptive binning method to efficiently tabulate the on- and off-edge likelihood functions from preexisting human-labeled edge databases. Their approach led to significantly improved edge detection performance compared to single-feature edge classifiers, but did not address the issue as to whether, or how, the edge probability calculation could be implemented by cell-cell interactions in a biological context—one of the major goals of this work. 
With this question of biological implementation in mind, we took an alternative approach that depended on class conditional independence (CCI) within the chosen filter set (that is, independence of the filter responses both when an edge is present and when one is absent). If/when the CCI assumption is satisfied (see Jacobs, 1995, for review), the on- and off-edge likelihood functions can be factored into products of single-filter likelihood functions, and then rewritten in terms of a sum of log-likelihood (LL) ratio terms (Equations 1–3). 
The ability to factor the filter likelihood functions leads to three advantages: (a) the requirements for human-labeled data are reduced from the order of x^N to x × N, where N is the number of participating filters and x is the number of gradations in each filter output; (b) each LL ratio term can be expressed and visualized as a function of a single filter value ri, making explicit the information that a filter at one location in an image carries about the presence of an edge at another; and (c) in the overall edge probability calculation, the positive and negative evidence from surrounding filters that is captured by these LL ratios can be combined linearly, a simple, neurally plausible computation. (As a caveat, the LL ratios themselves are generally nonlinear functions of the filter values, complicating the neural interpretation; see the Discussion.) 
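To put rough numbers on advantage (a): with, say, x = 50 response gradations (the binning used later in the Methods) and N = 6 filters (the final filter-set size), the full joint table has over 10 billion cells, versus 300 one-dimensional bins under factorization. A quick illustrative calculation:

```python
# Labeled-data requirements with and without factorization (illustrative
# values: x = 50 bins matches the Chernoff binning in Methods, N = 6 the
# final filter-set size chosen in Results).
x, N = 50, 6
joint_cells = x ** N      # cells in the full joint likelihood table
factored_cells = x * N    # one 1-D likelihood histogram per filter
print(joint_cells, factored_cells)  # 15625000000 300
```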
The series of steps taken to collect the needed ground truth edge data, to identify a set of informative CCI filters surrounding a candidate edge, and to parametrically represent their LL ratios, are described in the following. The performance of the resulting parameter-free edge detector is then evaluated, and the biological significance of the underlying population-based computation is discussed. 
Methods
Bayesian cue combination
Following Jacobs (1995), the cue combination problem in the context of edge detection, given filter values r1, r2, …, rN, may be expressed in probabilistic terms via Bayes's rule,

\[ P(\mathrm{edge} \mid r_1, \ldots, r_N) = \frac{P(r_1, \ldots, r_N \mid \mathrm{edge})\, P(\mathrm{edge})}{P(r_1, \ldots, r_N)} \qquad (1) \]

\[ = \frac{P(r_1, \ldots, r_N \mid \mathrm{edge})\, P(\mathrm{edge})}{P(r_1, \ldots, r_N \mid \mathrm{edge})\, P(\mathrm{edge}) + P(r_1, \ldots, r_N \mid \overline{\mathrm{edge}})\, P(\overline{\mathrm{edge}})} \qquad (2) \]

and then rewritten to make explicit the prior and likelihood ratios in the denominator:

\[ = \left[ 1 + \frac{P(\overline{\mathrm{edge}})}{P(\mathrm{edge})} \cdot \frac{P(r_1, \ldots, r_N \mid \overline{\mathrm{edge}})}{P(r_1, \ldots, r_N \mid \mathrm{edge})} \right]^{-1} \qquad (3) \]
Under the assumption of class-conditional independence among the N filters, the likelihoods in Equation 3 can be factored and rewritten in terms of a sum of log-likelihood ratio terms, each one a function of a single filter's value; the sum then acts as the argument to a sigmoid function σ(x) = 1/(1 + e^(−x)):

\[ P(\mathrm{edge} \mid r_1, \ldots, r_N) = \left[ 1 + \frac{P(\overline{\mathrm{edge}})}{P(\mathrm{edge})} \prod_{i=1}^{N} \frac{P(r_i \mid \overline{\mathrm{edge}})}{P(r_i \mid \mathrm{edge})} \right]^{-1} \qquad (4) \]

\[ = \sigma\!\left( \log \frac{P(\mathrm{edge})}{P(\overline{\mathrm{edge}})} + \sum_{i=1}^{N} \log \frac{P(r_i \mid \mathrm{edge})}{P(r_i \mid \overline{\mathrm{edge}})} \right) \qquad (5) \]
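Numerically, this factored rule is a small computation: the prior log-odds plus one log-likelihood-ratio term per filter, passed through a sigmoid. A minimal sketch (function and variable names are ours; the per-filter LL-ratio functions are assumed to be supplied from elsewhere, e.g., the parametric likelihood fits described in the Results):

```python
import math

def edge_probability(responses, log_prior_odds, ll_ratio_fns):
    """Sigmoid of the prior log-odds plus one log-likelihood-ratio
    term per filter; valid under class-conditional independence (CCI)."""
    z = log_prior_odds + sum(f(r) for f, r in zip(ll_ratio_fns, responses))
    return 1.0 / (1.0 + math.exp(-z))
```

With zero prior log-odds and uninformative filters (LL ratio identically zero), the output is 0.5, as it should be; strong positive filter evidence can overcome even the low ∼2% edge prior measured later in the Results.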
A modified version of Bayes's rule conditioned on the value of a reference filter rref, evaluated at the location/orientation where the edge probability is being calculated, is used in the Results section to reduce higher order statistical dependencies among the other contributing filters (see text for details and references):

\[ P(\mathrm{edge} \mid r_{\mathrm{ref}}, r_1, \ldots, r_N) = \sigma\!\left( \log \frac{P(\mathrm{edge} \mid r_{\mathrm{ref}})}{P(\overline{\mathrm{edge}} \mid r_{\mathrm{ref}})} + \sum_{i=1}^{N} \log \frac{P(r_i \mid \mathrm{edge}, r_{\mathrm{ref}})}{P(r_i \mid \overline{\mathrm{edge}}, r_{\mathrm{ref}})} \right) \qquad (6) \]
Image database and extraction of the luminance channel
RGB images were converted to the following three independent components using the method of Hyvarinen (1999), trained on a random sample of 1.5 million pixels from the Berkeley Segmentation database (D. R. Martin et al., 2004):   
The components O1, O2, O3 roughly corresponded to red-green, blue-yellow, and luminance channels, respectively. In this paper we used only O3, the luminance channel. 
Figure 2
 
The linear filter, its statistics, and its use in ground truth labeling. (A) Oriented linear filter kernel. Convolution results were rectified at zero to obtain the filter response ri. The pixel that denotes the location of the filter is marked by red shading. (B) The log pdf of filter responses measured at all locations and orientations in the database. (C) Example image patches at three linear response levels measured at the reference location (red rectangle). (D) Probability of an edge for a given linear response (red data points). Fit to data (solid curve) is a sigmoid y = 1/(1 + e^(−s(x − t))); s = 9.9, t = 0.3804.
Chernoff information
We used Chernoff information as a measure of distance between the on-edge and off-edge likelihood distributions for a given filter (Konishi et al., 2003):

\[ C(P_{\mathrm{on}}, P_{\mathrm{off}}) = -\log \sum_{j} P_{\mathrm{on}}(y_j)^{\lambda}\, P_{\mathrm{off}}(y_j)^{1-\lambda} \]

where Pon = P(r | edge), Poff = P(r | no edge), the yj are the filter response bins (50 bins in the range [0, 1]), and λ was set to 0.5 (Konishi et al., 2003). 
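For discrete likelihood histograms over shared bins this measure is a one-line computation; with λ fixed at 0.5 it coincides with the Bhattacharyya distance. A sketch (function name is ours):

```python
import numpy as np

def chernoff_information(p_on, p_off, lam=0.5):
    """-log sum_j P_on(y_j)^lam * P_off(y_j)^(1-lam) over shared response
    bins. Identical distributions give 0; disjoint ones give infinity."""
    p_on, p_off = np.asarray(p_on, float), np.asarray(p_off, float)
    return -np.log(np.sum(p_on ** lam * p_off ** (1.0 - lam)))
```

Larger values indicate more separable on-edge and off-edge distributions, i.e., a more informative filter.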
Poisson kernel smoothing
To construct the on-edge and off-edge response distributions, we used kernel density estimation, where each instance of a filter's value was spread along the x axis of its likelihood function with a Poisson “kernel.” This was equivalent to considering a filter response to be the average firing rate of a noisy neuron, and then repeatedly measuring the virtual neuron's actual spike count over a short time window, and histogramming the results. Greater smoothing was achieved by using a shorter time window for spike counting. Let ri be the measured filter value (ranging from zero to one), fmax the virtual noisy neuron's maximum firing rate, τ the time window for measuring spikes, and λ = ri × fmax × τ the expected spike count within the time window. The Poisson distribution P(k) = (λ^k / k!) e^(−λ) then gives the probability of reading out k spikes for that filter value, which is then accumulated in a histogram indexed by k. After all of a filter's measured values are processed in this way, the histogram is normalized both horizontally (from zero to one) and vertically (to convert it to a probability density). Example likelihood functions processed in this way are shown in Figure 3A using fmax = 100 Hz, and τ = 500 ms. 
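The smoothing procedure can be sketched as follows, using the fmax = 100 Hz and τ = 500 ms values from the text; the histogram support and the exact normalization details are our guesses where the text is silent:

```python
import math
import numpy as np

def poisson_pmf(k, lam):
    # P(k) = (lam^k / k!) e^(-lam), computed in log space for stability
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_smooth(values, f_max=100.0, tau=0.5):
    """Spread each measured filter value r in [0, 1] along the response
    axis as a Poisson spike-count distribution with mean r * f_max * tau,
    then normalize horizontally (to [0, 1]) and vertically (to a density)."""
    k_ref = int(round(f_max * tau))     # spike count corresponding to r = 1
    counts = np.zeros(k_ref + 1)
    for r in values:
        lam = r * f_max * tau
        counts += [poisson_pmf(k, lam) for k in range(k_ref + 1)]
    x = np.arange(k_ref + 1) / k_ref    # horizontal normalization
    return x, counts / np.trapz(counts, x)  # vertical normalization
```

Note how the amount of smoothing falls out of the physiological analogy: a shorter counting window τ gives a smaller expected count λ, hence a relatively broader Poisson kernel and a smoother density.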
Figure 3
 
Modeling likelihood functions of neighboring filters. (A) Distribution of filter responses taken at the same center but rotated 45° relative to the reference filter, for rref = 0.3 (upper panel) and rref = 0.5 (lower panel). Filter responses, including rref, are normalized to the range [0, 1]. Red curves are for when an edge was judged to be present at the reference location, blue curves for when an edge was judged to be absent. Each panel shows Poisson-smoothed data (thin curves) and parametric fits (thick curves). (B) Plots of the five parameters used to fit the Poisson-smoothed likelihoods as a function of reference filter contrast for a different filter, depicted in the inset of Panel C. The off-edge case had only the first four data points, since an image patch only very rarely contains no edge when rref = 0.9. (C) Examples of on-edge likelihood functions generated from the parametric model at a range of reference filter values, with the Poisson-smoothed data shown superimposed in thin black lines for the five cases for which labeled data was actually collected (red curves). Green dashed lines are on-edge likelihood functions generated from the parametric model at intermediate, unlabeled reference filter values. Generalization to new data was good: green solid line shows Poisson-smoothed data collected at rref = 0.2, which was not part of the training set (quality of fit to model prediction: r2 = 0.99).
Results
In the following sections, we describe the procedure used to collect and label image patches and to analyze filter statistics, so that a set of informative, CCI filters could be identified, and the terms of Bayes's rule as expressed in Equation 6 evaluated. We then illustrate the use of the local edge probability calculation as an edge detector, and evaluate its performance on natural images. 
Image set and linear filter kernel
We used 450 images from the COREL database including a variety of indoor and outdoor scenes. Only the luminance channel was used (see Methods). Luminance images were convolved at a single scale with an oriented spatial difference operator as shown in Figure 2A. The 5 × 2 pixel filter was applied at 16 orientations in even steps of 22.5°. The center of rotation was the center of the shaded pixel. Filter responses were rectified at zero and normalized to lie in the range ri ∈ [0, 1]. When the filter was applied at noncardinal orientations, pixel values off the grid were determined by bilinear interpolation. The probability density function (pdf) of the filter's response at all locations and orientations in the database is shown in Figure 2B. The filter had a mean response of 0.012 (out of a maximum of one), and a roughly exponential falloff over most of the range so that, for example, the probability of measuring a filter value near 0.6 was 100,000 times lower than the probability of measuring a value near zero. 
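A sketch of this filtering step follows. The exact tap weights and geometry of the 5 × 2 kernel are our assumptions (only its difference-of-sides character and the bilinear sampling are taken from the text): five +1 taps on one side of the edge axis and five −1 taps on the other.

```python
import numpy as np

def bilinear(img, y, x):
    """Sample img at a non-integer (y, x) by bilinear interpolation."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    return ((1 - fy) * (1 - fx) * img[y0, x0] + (1 - fy) * fx * img[y0, x0 + 1]
            + fy * (1 - fx) * img[y0 + 1, x0] + fy * fx * img[y0 + 1, x0 + 1])

def oriented_response(img, y, x, theta):
    """Rectified response of a hypothetical 5 x 2 spatial difference
    operator centered at (y, x) and rotated by theta, with off-grid
    samples obtained by bilinear interpolation."""
    c, s = np.cos(theta), np.sin(theta)
    resp = 0.0
    for dy in (-2, -1, 0, 1, 2):
        for dx, w in ((-0.5, 1.0), (0.5, -1.0)):
            ry = y + c * dy - s * dx
            rx = x + s * dy + c * dx
            resp += w * bilinear(img, ry, rx)
    return max(resp, 0.0) / 5.0   # rectify at zero; scale into [0, 1]
```

Because responses are rectified, the filter at θ and the one at θ + 180° together cover both contrast polarities, consistent with applying the kernel at 16 orientations in 22.5° steps.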
Calculating edge prior probability
As a precursor to computing local edge probability (LEP) based on a population of filters, we first measured (a) the prior edge probability P(edge) at a randomly chosen location, and (b) the posterior edge probability based on the reference filter value alone P(edge | rref). To compute the edge prior, 1,000 image patches were drawn at random from the database, and a randomly oriented reference location was marked by a red box corresponding to the 5 × 2 pixel filter profile shown in Figure 2A. Human labelers were asked to judge whether an edge was present in each image that (a) spanned the length of the red box (i.e., entered and exited through opposite ends of the box); (b) remained entirely within the box; and (c) was unoccluded at the center of the box adjacent to the shaded pixel. Labelers were instructed to score edges as shown in Table 1. 
Table 1
 
Labeling system used to score edges at the reference location, with the corresponding interpretation and assigned edge probability.
Table 1
 
Labeling system used to score edges at the reference location, with the corresponding interpretation and assigned edge probability.
Score given | Interpretation | Assigned edge probability
1 | Certainly no edge | 0
2 | Probably no edge | 0.25
3 | Can't tell—around 50/50 | 0.5
4 | Probably an edge | 0.75
5 | Certainly an edge | 1
The assigned edge probabilities were averaged over all image patches and labelers (total of 1,000 labels), yielding an estimate P(edge) = 1.95% ± 0.3%. 
Edge probability based on a single filter at the reference location
Using a similar method, we measured edge probability at the reference location conditioned on rref, the filter value computed at the reference location itself (i.e., in the red box). Image patches were again drawn at random from the database, this time collected in narrow bins centered at five values of rref = {0.1, 0.3, 0.5, 0.7, 0.9}. Bin width was 0.02. Image patches with rref values outside the bin ranges were discarded. The collection process continued until each bin contained 500 exemplars. Example patches are shown in Figure 2C for three of the five values of rref, showing the clear tendency towards higher edge probability as the value of rref increased. Using the same labeling scheme as above, edges were scored and scores were averaged within each bin. The result is plotted in Figure 2D (red data points) along with a sigmoidal fit (solid black curve). 
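The fitted curve can be evaluated directly from the parameters reported in the Figure 2D caption (s = 9.9, t = 0.3804); the function name below is ours:

```python
import math

def p_edge_given_rref(r, s=9.9, t=0.3804):
    """Sigmoid fit of Figure 2D: P(edge | rref) = 1 / (1 + e^(-s (r - t)))."""
    return 1.0 / (1.0 + math.exp(-s * (r - t)))
```

As a consistency check, this fit gives roughly 0.31 at rref = 0.3 and 0.77 at rref = 0.5, matching the on-edge proportions (>30% and 77%) reported later in the text.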
Using a population of surrounding filters
We next considered the general case in which multiple filters surrounding a reference location would be used, in addition to rref, to calculate the edge probability at the reference location (Equation 3). Multiple strategies, described in the following, were used to narrow down the large population of filters surrounding a reference location to a subset that is as CCI as possible. As it was also our goal to include only the most informative filters in the chosen filter set, but we wished to avoid measuring the informativeness of large numbers of filters that would later be rejected based on their failure to meet the CCI criteria, the steps taken to minimize filter dependencies and maximize filter informativeness were interleaved so as to reduce overall computational effort. 
Independence Strategy 1: Avoiding higher order structural dependencies
A large set of filters lie in the tangential regions flanking a candidate edge (Figure 1), and these filters are known both to contain contour-related information (Geisler et al., 2001; Sigman et al., 2001) and to be used by our visual system to help detect contours (Field et al., 1993; Kapadia et al., 1995). However, geometric constraints associated with real-world object contours introduce strong higher order dependencies between tangential filters. The effect is illustrated in Figure 1: When an edge is known to be present at the reference location, the additional knowledge that edge elements are present at Locations A and C acts as evidence for the existence of edges at B and D, but against the existence of edges at E and F. Given this type of statistical dependency, we excluded tangential filters from consideration, and focused instead on filters within the “orthogonal” regions to either side of the reference location (Figure 1). Filter responses in these regions are largely determined by the surfaces and textures that help define a contour, and are less inter-predictable. For computational tractability, we limited our data collection to the single line of filters cutting perpendicularly through the center of the reference box, for a total of 7 Pixel Positions × 16 Orientations = 112 Total Filter Candidates (see blue lines in Figure 1). 
Independence Strategy 2: Suppressing higher order region-based correlations
It is known that neighboring filter responses in natural images exhibit higher order correlations that stem from the fact that nearby points in the world are often part of the same texture and/or subject to the same illumination or contrast conditions; all of these factors affect average regional filter “power.” These regional power effects induce a particular type of higher order dependency between nearby filters, in which a strong response in one filter predicts a higher response variance in other filters (Karklin & Lewicki, 2003; Parra, Spence, & Sajda, 2001; Schwartz & Simoncelli, 2001; Zetzsche & Röhrbein, 2001). It has been previously pointed out that such dependencies can be suppressed through divisive normalization (Carandini & Heeger, 2012; Karklin & Lewicki, 2003, 2005; Liang, Simoncelli, & Lei, 2000; Parra et al., 2001; Schwartz & Simoncelli, 2001; Wainwright & Simoncelli, 2000; Zetzsche & Röhrbein, 2001; Zhou & Mel, 2008). Adopting a different but related strategy (with secondary benefits as discussed below), we tabulated the 1-D likelihood distributions for each candidate filter conditioned both on the edge/no edge distinction, and on the value of rref, in order to obtain the likelihood functions P(ri | edge, rref = C) and P(ri | no edge, rref = C) (Figure 3A). Fixing the contrast at the reference location served a similar decorrelating function among surrounding filters as would a divisive normalization, in the sense that image patches within any given rref = C bin exhibit far less variation in regional power than image patches in general (data not shown). 
Given that C took on the same five values as were used previously to measure P(edge | rref), all the image patches needed to construct the likelihood functions for the 112 filter candidates had already been collected and labeled. 
A secondary advantage of “slicing up” and separately collecting the likelihood functions at a range of rref values, beyond its effect of decorrelating surrounding filters, is that the approach greatly increases the amount of on-edge data from which the on-edge likelihood functions are constructed. To see this, consider a labeling strategy in which reference locations are selected at random: this yields on-edge cases only 2% of the time, so that populating the on-edge likelihood distributions requires a very large amount of data to be labeled. In contrast, when image patches are automatically preselected at specific values of rref (the higher of which are very rarely encountered by random selection), the proportion of on-edge data is dramatically increased, constituting more than 30% of cases when rref = 0.3, and 77% of cases when rref = 0.5 (Figure 2D). More data in the on-edge likelihood distributions leads to more accurate estimates of the log-likelihood ratios that underlie the local edge probability calculation. 
Strategy for choosing informative filters
The preceding steps produced on-edge and off-edge likelihood functions for all 112 filters in the orthogonal region, tabulated separately at five different contrast levels at the reference location. Given that likelihood ratios would ultimately need to be computed, involving division operations with small, uncertain denominators, the tabulated ri data was smoothed by replacing each measured ri value with a horizontally scaled Poisson distribution with the same mean value. This reflected the “biological” assumption that the true filter value from the image would not be directly accessible for computation, since it would need to be communicated through a noisy spike train within a limited time window (see Methods for details). 
Piecewise Gaussian fits and evaluation of likelihood functions at unlabeled contrasts
We required a parametric representation that would allow us to evaluate filter likelihood functions at arbitrary reference filter contrasts, i.e., not limited to the five discrete values of rref at which labeled data was actually collected. Each Poisson-smoothed distribution was divided into three sections: (a) a delta function at the origin, which collected all negative values of the filter due to rectification, in addition to bona fide zero values; (b) the left section of the density from zero through the peak; and (c) the right section of the density from the peak to one. The left and right parts of the distribution were fit by separate Gaussian functions that met at the peak. Example fits of the on-edge and off-edge distributions are shown in Figure 3A at two reference contrast levels (upper and lower panels, respectively) for the filter at the same center but rotated 45° relative to the reference filter. 
The parameters of the fits (five total parameters: the height at the origin, Amp[0], the mode, the height at the mode, Amp[mode], and the standard deviations σleft and σright for the two Gaussians), were plotted at the five reference contrast levels for which labeled data was collected. A piecewise cubic Hermite interpolating polynomial was then fit through each of the five parameter plots for both the on-edge and off-edge distributions for each filter. Plots of the five spline-fit functions are shown in Figure 3B for a different filter, this one rotated 22.5° relative to the reference location. The spline fits allowed the parameters of the on-edge and off-edge likelihood distributions to be generated for any value of reference filter contrast; the fits (red curves) to the collected likelihood functions (black curves) are shown in Figure 3C, along with several likelihood functions at interpolated values of rref (green dashed curves). For purposes of cross validation, a new set of labeled data was collected for rref = 0.2. The resulting likelihood function (green solid curve) corresponded closely to its prediction based on interpolation of the fit parameters. 
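For concreteness, the five-parameter density described above can be written out directly; treating the mass at the origin as an additive spike on the evaluation grid is our reading of the text, and the function name is ours:

```python
import numpy as np

def piecewise_gaussian(x, amp0, mode, amp_mode, sig_left, sig_right):
    """Two half-Gaussians of common height amp_mode meeting at `mode`
    (standard deviations sig_left and sig_right), plus a spike of height
    amp0 at the origin for rectified and genuinely zero filter responses."""
    x = np.asarray(x, float)
    sig = np.where(x < mode, sig_left, sig_right)
    y = amp_mode * np.exp(-0.5 * ((x - mode) / sig) ** 2)
    return y + np.where(np.isclose(x, 0.0), amp0, 0.0)
```

Because both half-Gaussians evaluate to amp_mode at the mode, the fitted density is continuous there by construction, while sig_left ≠ sig_right lets it capture the skew visible in the measured likelihoods.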
Poisson kernel-smoothed likelihood functions are shown for a neighboring filter as thin lines in Figure 3A (black curves for on edge, magenta for off edge), along with five-parameter fits shown as superimposed thick lines and dots. 
Following the approach of Konishi et al. (2003), we used Chernoff information (see Methods) to evaluate the informativeness of the 112 filters at each of the three middle values of C (0.3, 0.5, 0.7; Figure 4A). The Chernoff information was calculated based on the Poisson-smoothed likelihood distributions. Filters were ranked within each C level (from 1 = best to 112 = worst) and the ranks for each filter were averaged across C levels, weighted by the log probability of encountering that C level in the database (see Figure 4B). For viewing convenience, the weighted ranks were inverted (newrank = 112 − oldrank) so that the best filters had the highest scores (Figure 4B). The top 30% of the filters (N = 34) were kept for further evaluation (Figure 4C). 
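The rank-combine-invert step can be sketched as follows (function and argument names are ours; `weights` stands in for the text's log-probability weighting of the C levels):

```python
import numpy as np

def weighted_inverted_ranks(chernoff, weights):
    """chernoff: (n_filters, n_levels) Chernoff information per filter and
    reference-contrast level C. Ranks filters within each level (1 = most
    informative), averages ranks across levels with the given weights, and
    inverts so that higher scores mean more informative filters."""
    n = chernoff.shape[0]
    order = np.argsort(-chernoff, axis=0)      # best filter first per level
    ranks = np.empty_like(order)
    for c in range(chernoff.shape[1]):
        ranks[order[:, c], c] = np.arange(1, n + 1)
    w = np.asarray(weights, float)
    return n - ranks @ (w / w.sum())           # invert: best filters score highest
```

Keeping the top 30% of scores then reproduces the selection step: `np.argsort(-scores)[:int(0.3 * n)]`.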
Figure 4
 
Selecting informative filters. (A) Chernoff information of neighboring filters at three different reference contrasts (C = 0.3, 0.5, and 0.7). (B) Weighted average ranks over contrast levels for all neighboring filters, inverted so tall columns indicate more information. Top 30% of the 112 filters are marked in red. (C) Position and orientation of the most informative filters in the orthogonal region are shown relative to the reference location.
Independence Strategy 3: Minimizing overlap correlations
Simple physical overlap of two or more filters' regions of support can produce correlations between their outputs—consider two filters with the same center but slightly different orientations. We searched exhaustively through the 34 remaining filter candidates for subsets of N filters that had a low mean absolute pairwise correlation (MAPC) between their responses: MAPC = (2 / (N(N − 1))) Σi<j |ρ(ri, rj)|, where N = 6 and ρ(ri, rj) is the correlation between two filters i and j over all pixel locations and orientations in the image database. The distribution of MAPC values is shown in Figure 5 for the (34 choose 6) ≈ 1.3 million six-wise filter combinations tested. The low MAPC score for the filter set that would ultimately be chosen is marked with a red triangle, while the average MAPC value over all six-wise filter combinations is marked with a green square. Two 6 × 6 pairwise correlation matrices for the two marked cases are shown as insets. For the average case, a single representative filter set was chosen. The 3,362 filter sets with correlation scores in the lowest 0.25% of the MAPC distribution (lower red tail, including the red triangle case) were set aside for performance testing on labeled natural edges. 
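The exhaustive subset search can be sketched as follows, using a toy population of eight random "filters" in place of the 34 candidates so that the search runs instantly; the MAPC score for a subset is simply the mean of |ρ| over all pairs in that subset.

```python
import numpy as np
from itertools import combinations

def mapc(corr, idx):
    """Mean absolute pairwise correlation among the filters in idx,
    computed from a precomputed correlation matrix."""
    return float(np.mean([abs(corr[i, j]) for i, j in combinations(idx, 2)]))

# Toy example: random responses of 8 "filters" stand in for the 34 candidates.
rng = np.random.default_rng(0)
responses = rng.standard_normal((8, 10000))  # filters x samples
corr = np.corrcoef(responses)

# Exhaustive search over all 6-wise subsets for the lowest MAPC.
scores = {idx: mapc(corr, idx) for idx in combinations(range(8), 6)}
best_set = min(scores, key=scores.get)
```

With the real 34 candidates the same loop runs over (34 choose 6) ≈ 1.3 million subsets, which is still tractable because the pairwise correlation matrix is computed once up front.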
Figure 5
 
Distribution of mean absolute pairwise correlation (MAPC) scores for ∼ 1.3 million six-wise combinations of the most informative filters. Two 6 × 6 pairwise correlation matrices are shown at upper right for two cases: red triangle corresponds to a filter set with one of the lowest correlation scores; this set was eventually used in the edge detection algorithm; green square shows a case with an average MAPC score. Least inter-correlated 0.25% of filter sets (left tail of distribution, shaded red) were carried forward for further processing.
Final strategy for co-optimizing filter independence and informativeness: Select for sharp tuning on natural object edges
Among the most common structures that elicit false positive responses from an edge detector are true edges that are slightly misaligned with the detector in position and/or orientation. To make the final choice among the remaining ∼ 3,400 filter sets, therefore, we set out to incorporate the likelihood functions for each filter set in turn into the rref-conditioned version of Bayes's rule (Equation 6), and measure the position and orientation tuning curves of the resulting probabilistic edge detector. The filter set with the sharpest tuning in both position and orientation would be selected. 
To generate orientation and position tuning curves for each filter set, the ∼ 3,000 image patches in the database that had been labeled as containing edges were presented to each edge detector at 16 orientations (at the reference position) and seven positions (at the reference orientation), and tuning curves were generated. Examples of tuning curves for the filter set that would eventually be chosen are shown in Figure 6A and C at five levels of contrast. Full width at half maximum (FWHM) scores were extracted from each of the ∼ 3,000 tuning curves for each filter set, and the scores were averaged and histogrammed (Figure 6B, D). The filter set that had the lowest average rank in the two histograms, with FWHM values marked by red triangles, was the filter set adopted for use in the edge detection algorithm. For comparison, green triangles show the FWHM values for a single linear filter at the reference location. 
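The FWHM extraction step can be sketched as follows: a minimal implementation assuming a sampled tuning curve with a single dominant peak, illustrated on a synthetic Gaussian orientation tuning curve (the 20° width is an arbitrary choice for the example).

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a sampled tuning curve,
    using linear interpolation between samples around the half-max crossings."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    half = y.max() / 2.0
    i = int(np.argmax(y))
    # Walk left and right from the peak while samples stay above half max.
    li = i
    while li > 0 and y[li - 1] >= half:
        li -= 1
    ri = i
    while ri < len(y) - 1 and y[ri + 1] >= half:
        ri += 1
    # Interpolate the exact crossing positions (fall back to curve ends).
    xl = np.interp(half, [y[li - 1], y[li]], [x[li - 1], x[li]]) if li > 0 else x[0]
    xr = np.interp(half, [y[ri + 1], y[ri]], [x[ri + 1], x[ri]]) if ri < len(y) - 1 else x[-1]
    return float(xr - xl)

# Example: Gaussian-shaped orientation tuning curve, sigma = 20 degrees;
# the analytic FWHM is 2 * sqrt(2 ln 2) * sigma, about 47.1 degrees.
theta = np.linspace(-90, 90, 181)
curve = np.exp(-theta**2 / (2 * 20.0**2))
width = fwhm(theta, curve)
```

In the paper this measurement would be repeated for each of the ∼ 3,000 labeled edges and each candidate filter set, then averaged to give one tuning-width score per set.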
Figure 6
 
Orientation and position tuning of the local edge probability (LEP) calculated for each of the ∼ 3,400 filter sets tested. (A) Example orientation tuning curves for the chosen filter set are shown at five values of rref. Averages for each reference value are shown as thick colored lines. Inset shows response at preferred orientation at five different levels of contrast. (B) For each tested filter set, tuning curves were generated for each of the ∼ 3,000 human-labeled edges in the database. Full width at half maximum (FWHM) values were calculated for each tuning curve, the results were averaged, and the average tuning width for that filter set was entered into the histogram. The orientation tuning score of the chosen filter set is marked by a red triangle. The much larger FWHM score for a single linear filter at the reference location is marked by the green square. (C) Positional tuning curves covering three pixels above and below the reference position. (D) Distribution of average FWHM values for the positional tuning curves. Tuning score for the chosen filter set is again marked by a red triangle, and the tuning for a linear filter at the reference location is marked by a green square.
We note that the average tuning width of our LEP filter is broader than that of the simple cells studied by Gardner et al. (1999) (48° vs. 32°, FWHM), but the comparison must be interpreted in light of the fact that (a) our underlying linear filter was also more broadly tuned than theirs (82° vs. 56°), suggesting that a roughly comparable amount of nonlinear sharpening had occurred, and (b) to measure orientation tuning, we used images of natural edges, which were highly variable in form (including variation in orientation up to the limits of the red box) and were coarsely pixelated (with only 10 pixels covering the reference location), whereas Gardner et al. used optimized smooth sinusoidal gratings. 
Performance evaluation on natural images
The final filter set is depicted in Figure 7A, along with its on-edge and off-edge likelihood functions (Figure 7B) and likelihood ratios (Figure 7C) conditioned on a reference filter value of 0.3. Likelihood functions and ratios at higher and lower values of rref were similar in form, but were pushed towards higher or lower ends of the ri range, respectively. 
Figure 7
 
(A) The set of six neighboring filters finally chosen for the local edge probability computation. (B) The on-edge (red) and off-edge (blue) likelihoods for each of the six neighboring filters when rref = 0.3. (C) Likelihood ratios (i.e., ratio of red and blue curves in B) for each filter.
Figure 8A shows a scatter plot of linear reference responses versus calculated edge probability at all positions and orientations in the image shown in Figure 8C. Notably, for a given linear score plotted on the horizontal axis, substantial variation was seen in the calculated edge probability. To determine whether this vertical spread was consistent with the judgments of human labelers, we identified image patches at the 10th (blue dots) and 90th (red dots) percentiles of the LEP range in five evenly spaced bins along the rref axis. The image patches are shown in Figure 8B, along with their corresponding LEP scores. Inspection of the patches confirms that edge probability within the reference boxes (according to the scoring rubric of Table 1) was much higher for the 90th percentile cases (top row) than the 10th percentile cases (bottom row). To extend this type of comparison to a more global perspective, we located all sites in the image where the linear score was between 0.12 and 0.38 (corresponding to all gray dots in Figure 8A). All cases above the 80th percentile in the LEP score distribution (i.e., above the red line in Figure 8A) were presumptive “good edges” and were labeled with red line segments in Figure 8C (left frame). Similarly, all cases below the 20th percentile of the LEP distribution for the corresponding linear score (below the blue line in Figure 8A) were presumptive “bad edges” and were labeled with blue line segments in Figure 8C (right frame). The upper cutoff of 0.38 on the linear axis was chosen because at that linear score, the edge probability reached 50% (Figure 2D), so that the visual distinction between “good” and “bad” edges within any given linear bin above that value would necessarily begin to fade. Consistent with the examples of Figure 8B, red-labeled edges were much more likely to be properly aligned and positioned relative to actual object edges, whereas blue edges were typically misaligned by a pixel or two, and/or misoriented. 
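The binning-and-percentile selection used to pick the example patches can be sketched with synthetic data; the toy LEP values below are illustrative stand-ins (in the paper they come from the Bayesian computation), and only the binning logic follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
lin = rng.uniform(0.12, 0.38, 5000)                      # linear reference scores
lep = np.clip(lin + rng.normal(0, 0.1, lin.size), 0, 1)  # toy LEP with vertical spread

# Five evenly spaced bins along the linear-response axis, 0.12 to 0.38.
edges = np.linspace(0.12, 0.38, 6)
low_idx, high_idx = [], []
for b0, b1 in zip(edges[:-1], edges[1:]):
    in_bin = np.where((lin >= b0) & (lin < b1))[0]
    order = in_bin[np.argsort(lep[in_bin])]              # sort bin members by LEP
    low_idx.append(order[int(0.10 * len(order))])        # ~10th percentile of LEP
    high_idx.append(order[int(0.90 * len(order))])       # ~90th percentile of LEP
```

The indices collected in `low_idx` and `high_idx` play the role of the blue and red dots in Figure 8A: same linear score range, very different edge probabilities.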
Figure 8
 
Linear response versus local edge probability. (A) Scatter plot of linear filter response versus the LEP for the image shown in C. Colored dots mark cases at the 90th (red) and 10th (blue) percentiles within each of the five marked bins along the linear response axis (bin width = 0.02). (B) Image patches corresponding to marked examples in A are shown with their corresponding LEP scores. Note the much higher LEP scores, and edge probability, in the top versus bottom row. (C) All image locations corresponding to the scatter plot in A with LEP scores over the 80th percentile (red line in A) were marked with red line segments in the left panel, and all locations below the 20th percentile (blue line in A) were marked by blue line segments in the right panel. Red lines are generally well aligned with object edges whereas most blue lines are misplaced or misoriented.
To examine more closely what accounted for the spread in LEP values for a fixed linear reference score, we extracted image patches from the top and bottom of the LEP range for a linear reference score of 0.3. The two image patches are shown in Figure 9A, and the corresponding log likelihood ratios, the fundamental quantities summed to determine the LEP according to Equation 6, are shown in Figure 9C. The consistently positive values for the top case versus the two large negative values in the bottom case explain the very different LEP scores (0.65 vs. 0.0). 
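The computation illustrated in Figure 9, summing log likelihood ratios and converting the resulting log odds to a probability, can be sketched as follows; the LLR values and log prior odds below are hypothetical, chosen only to mimic the qualitative contrast between the two cases.

```python
import numpy as np

def lep_from_llrs(llrs, log_prior_odds):
    """Local edge probability as a sigmoid of the summed log likelihood
    ratios plus the log prior odds (the factorized Bayes's-rule form)."""
    log_odds = np.sum(llrs) + log_prior_odds
    return 1.0 / (1.0 + np.exp(-log_odds))

# Hypothetical six-filter LLR values loosely echoing Figure 9:
# consistently positive LLRs versus a case with two large negative LLRs.
top = lep_from_llrs([0.8, 0.6, 0.7, 0.5, 0.9, 0.6], log_prior_odds=-3.5)
bottom = lep_from_llrs([0.3, 0.2, -4.0, -4.5, 0.1, 0.4], log_prior_odds=-3.5)
```

With these illustrative numbers the first case lands near 0.65 and the second collapses to essentially zero, showing how two large negative filter contributions can veto an otherwise plausible edge.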
Figure 9
 
Illustration of the local edge probability computation at two locations with the same linear score but very different LEPs. (A) Image patches with marked reference locations. Linear filter response is the same (rref = 0.3) in both patches. (B) Log likelihood ratio curves, with values marked by red and blue symbols for the six neighboring filters applied to the upper and lower image patches, respectively. (C) Log likelihood ratios shown as bar heights. Resulting LEP values are shown above and below the image patches in A.
As a final form of evaluation, we computed the local edge probability at every pixel position and orientation in the luminance channel of entire images, and displayed the maximum LEP value over all orientations as each pixel's grayscale value (scaled between zero and 255, with darker pixels indicating higher edge probability). We referred informally to the overall edge detection algorithm as rm* (for “response based on multiple *riented filters”). Example images are shown in Figure 10, in comparison to a graded Canny-like algorithm (PbCanny) developed at UC Berkeley (D. R. Martin et al., 2004). We found that the rm* algorithm, with no free parameters, does a good job extracting bona fide local edge structure at the five-pixel length scale. 
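The display step can be sketched as follows, assuming a hypothetical LEP volume indexed by orientation; only the max-over-orientations and grayscale-rescaling logic comes from the text, and the array shapes are arbitrary.

```python
import numpy as np

# Hypothetical LEP volume of shape (orientations, height, width), one
# probability per pixel per orientation; random values stand in for the
# output of the Bayesian computation.
rng = np.random.default_rng(2)
lep_volume = rng.uniform(0.0, 1.0, size=(16, 32, 32))

# Max LEP over the 16 orientations at each pixel, then map to an
# inverted 0-255 grayscale so darker pixels mean higher edge probability.
lep_map = lep_volume.max(axis=0)
gray = np.round(255 * (1.0 - lep_map)).astype(np.uint8)
```

Because the mapping is monotone decreasing in LEP, the darkest pixel in `gray` is always the location with the highest edge probability.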
Figure 10
 
Results of applying the rm* algorithm to natural images. Maximum value of local edge probability across all orientations is shown at each pixel as the gray level. PbCanny results were generated with a scale parameter of one.
Discussion
We developed a strategy for culling a set of informative, class conditionally independent filters from the large population of oriented filters surrounding a candidate edge, from which the local edge probability could be calculated based on excitatory and inhibitory interactions among a few “simple cells.” The approach was motivated by a scientific goal—to gain insight into the possible cell-cell interactions in visual cortex that underlie the better edge detection performance of real simple cells compared to standard models of simple cells, and an engineering goal—to develop a local oriented edge detector that would provide high quality input to later stages of contour processing. Key features of our approach include (a) the use of a guided human labeling process in which image patches are automatically preselected to minimize labeling workload, (b) a focus on achieving filter independence, including a strict reliance on filters in the region orthogonal rather than tangential to the edge, and (c) the use of a relatively elaborate parametric model tailored to the peculiar forms of the on- and off-edge likelihood functions measured in the surrounding filter population. The resulting edge detection algorithm, which we informally termed rm*, calculates local edge probability as a linear combination of excitatory and inhibitory influences from surrounding filters (Equation 3). The detector has no free parameters, is much more sharply tuned in position and orientation on natural object edges than the underlying linear filters that drive it, and does a good job capturing the “semantics” of edges on a five-pixel length scale in natural images. 
Calculating local edge probability in the luminance channel at a single scale, a function routinely ascribed to cortical simple cells, is only one of the first operations needed to implement a full blown line drawing extraction process that is able to cope with the complexities of natural images. At a minimum, cues from the tangential regions surrounding an edge must be incorporated to capture contour continuity statistics (Adini et al., 1997; Bauer & Heinze, 2002; Choe & Miikkulainen, 1998; Dresp, 1993; Elder & Zucker, 1998; Field et al., 1993; Geisler et al., 2001; Grossberg & Williamson, 2001; Guy & Medioni, 1992; Kapadia et al., 1995; Kapadia et al., 2000; Kourtzi & Huberle, 2005; Kourtzi et al., 2003; W. Li & Gilbert, 2002; Z. Li, 1998; Parent & Zucker, 1989; Polat et al., 1998; Polat & Sagi, 1993; Ross et al., 2000; Sha'asua & Ullman, 1988; Sigman et al., 2001; VanRullen et al., 2001; Williams & Jacobs, 1997; Yen & Finkel, 1998), multiple color channels must be combined (Fine, MacLeod, & Boynton, 2003; Zhou & Mel, 2008), other boundary-defining cues such as texture, depth, and motion must be included (Jiang & Bunke, 1999; Malik, Belongie, Leung, & Shi, 2001; Stein, Hoiem, & Hebert, 2007; Will, Hermes, Buhmann, & Puzicha, 2000), and region-based segmentation cues, which are complementary to contour cues, must be brought in as well (Arbeláez, Maire, Fowlkes, & Malik, 2011; Dresp & Grossberg, 1997; D. R. Martin et al., 2004). Many of these higher order feature extraction, cue combination, and grouping processes will depend, however, on the quality of information passed up from the “simple cell” layer. It is thus crucial to avoid geometric assumptions about contour structure in early stages of processing that do not reflect natural image statistics, or to apply ad hoc thresholding nonlinearities that suppress weak edge cues too early in the process. In these respects, the rm* edge detector is well-suited to provide input to subsequent processing stages. 
The algorithm has no free parameters since it is parameterized entirely from natural edge statistics, and contains no ad hoc nonlinearities since the computation flows entirely from Bayes's rule. 
Are the independence strategies used here biologically plausible?
It is reasonable to assume that the visual system, through evolution, has adopted an approach to combining cues that is (a) easy to implement with neural hardware, (b) quick to learn from data, and (c) easily extensible when additional cues become available. All of these goals are facilitated if the neural system can find a way to create/extract a set of informative, class-conditionally independent variables from the surrounding population of cells. 
Do the three independence-enhancing strategies we have used have biological counterparts? The first strategy, limiting consideration to filters in the orthogonal region surrounding an edge while excluding tangentially aligned filters, amounts to breaking the overall population-decoding operation into two subcomputations: one in which class-conditional independence can potentially hold, allowing a simple decoding rule, and another in which the assumption very likely does not hold, so that a different decoding operation is needed. This division confers the classical advantages of functional modularity: a simpler overall computation, quicker learning, greater adaptability, etc. While we cannot point to direct evidence that the cortical computation of edge probability is modularized in this way, the notion that modular local circuits exist at the level of the cortical column is far from exotic (Douglas & Martin, 1991; Nessler, Pfeiffer, Buesing, & Maass, 2013; Szentagothai, 1978). As for the second strategy, suppressing higher order correlations by conditioning the on- and off-edge likelihood functions on fixed values of rref, as previously mentioned, this approach is closely related to divisive normalization, which is considered to be a canonical cortical operation (Carandini & Heeger, 2012). The third strategy, identifying mutually decorrelated subsets of nearby filters, was accomplished here using an exhaustive search, but a similar result could likely be achieved using a biologically plausible learning rule (e.g., Gerhard, Savin, & Triesch, 2009). 
Do simple cells calculate local edge probability?
The standard model of a simple cell has a linear filter at its core (Carandini & Heeger, 1994; Carandini, Heeger, & Movshon, 1997; Daugman, 1985; Gregory C. DeAngelis, Ohzawa, & Freeman, 1993; Jones & Palmer, 1987; Priebe & Ferster, 2012). Interestingly, Gardner et al. (1999) found that when a simple cell receptive field in cat visual cortex is modeled by the best fitting oriented linear filter, the simple cell's tuning curve is typically much sharper than the linear model predicts (FWHM = 32° vs. 56° on average). This finding led the authors to conclude that simple cell tuning is enhanced by nonlinear mechanisms. 
Perhaps the simplest mechanism that can sharpen tuning is an output nonlinearity such as a quadratic function (Heeger, 1992; Miller & Troyer, 2002; Ohzawa, Deangelis, & Freeman, 1990; Pollen, Gaska, & Jacobson, 1989; Priebe & Ferster, 2012). As previously discussed, however, applying an accelerating nonlinearity, or in the limit a hard threshold, after the linear filtering stage does not fundamentally improve edge detection performance, since it remaps any given linear response level to the same ordinate value regardless of whether it was caused by an edge or nonedge (see Figure 8). In contrast, inhibitory mechanisms, including divisive normalization, are known in certain contexts to be beneficial for probabilistic edge detection (Zhou & Mel, 2008), and could in principle sharpen orientation tuning by suppressing responses to improperly oriented stimuli. However, suppression within the classical receptive field (CRF) is generally either weakly tuned to orientation, or peaked at the cell's preferred orientation (Carandini et al., 1997; G. C. DeAngelis, Robson, Ohzawa, & Freeman, 1992; Xing, Ringach, Hawken, & Shapley, 2011), and thus cannot account for sharpened tuning within the CRF, which requires relatively greater suppression at non-preferred orientations. Suppression of V1 responses from outside the CRF is also generally peaked at the cell's preferred orientation (Blakemore & Tobin, 1972; Cavanaugh, Bair, & Movshon, 2002; Webb, Dhruv, Solomon, Tailby, & Lennie, 2005), but in any case could not explain the sharper-than-expected tuning reported by Gardner et al. (1999) which arose from stimuli confined to the CRF. 
An alternative explanation suggested by our findings is that the unexpectedly sharp tuning of simple cells could be due to a Bayesian local edge probability calculation like the one proposed here, involving mixed, and in some cases nonmonotonic excitatory-inhibitory interactions between cells (Figure 9B). 
What does the LEP calculation tell us about cortical processing?
A long term goal of this work has been to gain insight into the local circuit computations in V1 that underlie natural edge detection. What can the LEP computation tell us about this? Equation 5 can be rewritten in a form that is more neural in flavor: LEP = σ( Σi LLRi(ri) + log [P(edge) / P(no edge)] ) (Equation 7), where σ is a sigmoidal (logistic) function and the LLR terms (standing for log likelihood ratio) are functions that capture the net excitatory/inhibitory effect that each neighboring filter exerts on the edge probability at a reference location. (Equation 6 is of the same form, but breaks the overall computation into subcomputations for different reference contrast levels; Equation 5, which is simpler, would be sufficient assuming a divisive normalization can be found that maps the surrounding population of filter values into a standard contrast range, so that a single set of LLR functions can be used for all reference contrast levels). The LEP computation is similar to that of a two-layer feed-forward neural network, in the sense that the output, which we hypothetically ascribe to a simple cell, is a sigmoidal function of a sum of nonlinearly transformed filter outputs from the first layer. There is a complication, however. In a standard neural network, in lieu of distinct LLR functions “customized” for each input unit, the network would generally apply a standard nonlinear activation function, such as a sigmoid, to all first layer units. According to Bayes's rule, however, even in the case of independent first layer cues, the effect that an input unit has on the output “hypothesis” is not of a standard nonlinear form, and cannot in general be represented by a simple monotonic function. In particular, an input unit does not exclusively excite or exclusively inhibit an output as in a typical neural network; it may do either depending on the situation (Figure 9). The need for a bipolar interaction between units can be appreciated by imagining a simplified scenario in which two cells represent edges at two nearby orientations at the same visual field location. 
From a naive perspective, given that Cell 1 and Cell 2 represent slightly different edge hypotheses, only one of which can be correct at any given time, they are in a competitive relationship and should therefore inhibit each other so that one wins and the other loses. From a different perspective, however, when a bona fide edge is present at the preferred orientation of Cell 1, Cell 2 will generally also be activated (to a lesser degree) given that its receptive field overlaps heavily with that of Cell 1. From this perspective, Cell 2's activity is evidence for Cell 1's edge (and vice versa), suggesting that a mutually excitatory interaction between the two cells is appropriate. 
This “paradox” is resolved as follows. In the presence of an edge aligned with Cell 1, Cell 2 will normally be firing at a reduced level compared to Cell 1 (e.g., 60% of Cell 1's response) as dictated by its tuning curve. When this particular relative firing pattern occurs (r1 = 100%, r2 = 60%), Cell 2 provides positive evidence for Cell 1's hypothesis. If on the other hand Cell 2 is firing either too weakly (r2 ≪ 60%) or too strongly (r2 ≫ 60%) compared to Cell 1, Cell 2 provides negative evidence for Cell 1's hypothesis. This relative-levels effect is the source of the variable excitatory-inhibitory interaction within the Bayesian probability calculation, and the nonmonotonic LLR functions that occur in some cases as shown in Figure 9. Interestingly, the fact that nearby cells in the cortex do generally connect to each other through both excitatory and inhibitory pathways (Anderson, Carandini, & Ferster, 2000; Bonds, 1989; Isaacson & Scanziani, 2011; Priebe & Ferster, 2012) indicates that nearby cells are in principle capable of providing net positive or net negative evidence to each other depending on their relative firing rates, but determining whether this actually occurs along the lines proposed here will require further experiments. 
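The relative-levels effect can be illustrated with a toy model. The Gaussian densities below are hypothetical stand-ins for the measured on- and off-edge likelihoods of Cell 2's response, chosen so that on an edge aligned with Cell 1, Cell 2's response concentrates near 60% of Cell 1's; all numeric parameters are assumptions for illustration only.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Gaussian probability density, used here as a toy likelihood model."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def llr_cell2(r2):
    """Log likelihood ratio contributed by Cell 2's response r2
    (expressed as a fraction of Cell 1's response)."""
    on = gauss_pdf(r2, 0.6, 0.1)    # on-edge: r2 concentrates near 60% of r1
    off = gauss_pdf(r2, 0.2, 0.35)  # off-edge: broad and centered lower
    return np.log(on / off)

evidence_at_60 = llr_cell2(0.6)   # r2 at the expected level: supports the edge
evidence_weak = llr_cell2(0.0)    # r2 far too weak: evidence against
evidence_strong = llr_cell2(1.2)  # r2 far too strong: also evidence against
```

The resulting LLR is nonmonotonic in r2: positive near the expected relative level and negative at both extremes, which is exactly the bipolar excitatory/inhibitory interaction described in the text.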
Some expected challenges in interpreting experimental data
In carrying out such experiments, it is worth noting there will be significant challenges in interpreting the data. It is tempting to interpret LLR functions derived from Bayes's rule as literal predictions as to how simple cells in the visual cortex should act on their neighbors as a function of their relative offsets in position and orientation, assuming the goal of the local circuit is edge detection. If simultaneous recordings from pairs of nearby simple cells were to show mixed excitatory/inhibitory effects that depended systematically, as we predict, on their relative positions, orientations, and firing rates, this would constitute support for the idea that simple cells compute edge probability through their local interconnection network. However, such a finding would also raise questions, since, as discussed above, Equation 7 is a feed-forward computation, and it is not clear how this computation should be mapped onto a recurrent network of simple cells in the cortex. A recurrent model would presumably be initialized by seeding each cell with its underlying linear filter value, and then allowing the network to evolve according to its dynamics to a fixed point where all cells represent local edge probability. Our formulation would need to be adapted to function in this way before solid predictions about the local circuitry could be made. 
If on the other hand recordings from nearby cells failed to reveal location and activity-level-dependent interactions between simple cells like those in Figure 9, this would not rule out that local edge probability is calculated in V1. The calculation could be carried out by a feed-forward projection from a population of “simpler” (i.e., more linear) simple cells onto a population of more edge-savvy simple cells, or perhaps directly onto the dendrites of complex cells, if those cells have separately represented simple cell subunits (Archie & Mel, 2000; Mel, Ruderman, & Archie, 1998). A hybrid scheme combining both recurrent and feed-forward approaches might also exist in V1, further complicating the interpretation of experimental results. Yet another possibility is that LEP-sensitive simple cells are generated by the feed-forward projection from the LGN onto V1 Layer 4 neurons, where different dendrites of each simple cell might provide the multiple subunits needed for the LEP calculation, and where inhibitory interneurons deliver properly scaled amounts of inhibition. 
The level-dependent interactions between neighboring filters that fall out of the Bayesian edge probability calculation also complicate the interpretation of edge co-occurrence statistics in natural images (Geisler et al., 2001; Sigman et al., 2001), as well as pairwise interactions between fixed-contrast edge elements measured at the neurophysiological or psychophysical levels (Field et al., 1993; Kapadia et al., 1995; Kapadia et al., 2000). All of these types of data lead to butterfly-shaped scalar association fields as discussed in the Introduction. It is certainly possible for the interaction between neighboring filter sites in a Bayesian edge probability calculation to be purely excitatory or purely inhibitory, and indeed some of the filters in our chosen filter set are nearly purely inhibitory “towards” the reference location (Figure 9C, Filter r1). Likewise, inputs to the reference location from tangentially aligned filters (which we excluded from consideration given their statistical dependencies) may in some cases be nearly purely excitatory (Polat & Sagi, 1993), though interestingly, Polat et al. (1998) reported nonmonotonic modulation effects in visual cortical neurons for tangentially aligned cues of increasing contrast in a contour detection task. In general, it is most revealing to characterize the pairwise interactions centered on a reference location/orientation not as excitatory or inhibitory per se, but in terms of log likelihood ratios as in Figure 9, which are non-trivial functions of the filter responses at both locations. 
Relationships to previous work
The basic idea that a population of neurons can be decoded to yield improved estimates of sensory quantities has received considerable attention in the literature (Beck et al., 2008; Deneve, Latham, & Pouget, 1999; Georgopoulos, Schwartz, & Kettner, 1986; Gilbert & Wiesel, 1990; Hinton, 1986; Ludtke & Wilson, 2003; Ma, Beck, Latham, & Pouget, 2006; Paradiso, 1988; Pouget, Dayan, & Zemel, 2000; Sanger, 1996; Vogels, 1990; Zemel, Dayan, & Pouget, 1998), and a number of biologically inspired edge/contour detection systems have been proposed based on simple cell-like filters modulated by center-surround interactions (Grigorescu, Petkov, & Westenberg, 2003; Morrone & Burr, 1988; Sajda & Finkel, 1995; Zhang & von der Heydt, 2010). 
Surprisingly few edge/contour detection algorithms have been based on probabilities and measured natural image statistics, however (Dollar et al., 2006; Konishi et al., 2003; D. R. Martin et al., 2004), and to our knowledge, none so far has specifically addressed the way a population of oriented cells should interact to compute local edge probability. Martin et al. (2004) and Dollar et al. (2006) used human-labeled natural image data to parameterize their contour detection algorithms, but rather than using an explicit Bayesian formulation, they directly trained conventional classifiers with the ground truth data. The two approaches differed in that D. R. Martin et al. (2004) used a carefully designed set of boundary-related features and trained a logistic regression model (corresponding to a conventional two-layer neural network), whereas Dollar et al. (2006) trained a probabilistic boosting tree using a large number of simple spatial filters. 
The previous work most similar to ours is that of Konishi et al. (2003), who also collected on- and off-edge likelihood functions for simple filters based on human-labeled ground truth data. The main differences in our approaches are that: (a) we define the edge hypothesis to include orientation as well as position, on the grounds that the more homogeneous set of image patches containing edges of a particular orientation (defined by our reference location) will lead to more informative likelihood functions in the surrounding filter population (i.e., greater differences between on- and off-edge likelihoods); (b) labeling edges conditioned on different contrast levels at the reference location allows us to estimate the needed likelihood functions with a minimum of human labeling effort; (c) in lieu of preselecting a small number of filter types/locations, we evaluated 112 oriented linear filters in the orthogonal region cutting perpendicularly across a candidate edge (±3 pixels × 16 orientations), and culled out the most informative ones using Chernoff information; and (d) we took explicit measures to minimize correlations between the filters we would use, so that we could factorize the on- and off-edge likelihood functions. 
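The Chernoff-information criterion mentioned in (c) has a simple form for binned response distributions: C(P, Q) = max over λ in [0, 1] of −log Σ p_i^λ q_i^(1−λ). A generic grid-search sketch (not the paper's own implementation):

```python
import math

def chernoff_information(p, q, grid=101):
    """Chernoff information between two discrete distributions p and q
    (e.g., binned on- and off-edge filter-response likelihoods),
    found by grid search over the exponent lambda in [0, 1]."""
    best = 0.0
    for k in range(grid):
        lam = k / (grid - 1)
        # Chernoff coefficient at this lambda
        s = sum((pi ** lam) * (qi ** (1 - lam)) for pi, qi in zip(p, q))
        best = max(best, -math.log(s))
    return best
```

Identical distributions score zero; more separable on/off-edge likelihoods score higher, which makes this a natural criterion for culling uninformative filters.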
Performance evaluation
A large number and variety of edge detection methods have been developed over the years (see http://iris.usc.edu/Vision-Notes/bibliography/contentsedge.html for an annotated list of papers on various methods, and Papari & Petkov, 2011, for a recent review). Evaluating the relative quality of edge/contour detection algorithms is intrinsically difficult, however. Ground truth edge databases have been developed for this purpose (Bowyer, Kranenburg, & Dougherty, 1999; Geisler & Perry, 2009; D. Martin, Fowlkes, Tal, & Malik, 2001), which differ in terms of (a) their specificity of labeling, i.e., whether the ground truth includes location, orientation, scale, and edge polarity as in our approach, versus location only, or anything in between; (b) their accuracy of labeling, i.e., within a few pixels versus subpixel accuracy; and (c) the method for selecting what to label, for example, every pixel versus a random subsample versus an automatically selected subsample versus “label what you want.” Depending on these choices, benchmarking results may be more or less helpful in comparing the quality of different algorithms. For a Bayesian approach such as ours, we would ideally want a “complete” set of image locations/orientations accurately labeled for edge probability. Precision-recall (PR) curves would then provide a useful metric of edge detection performance. The main practical challenge in producing reliable PR curves is pinning down the on-edge likelihood distribution. 
Knowing the true edge probability at the low end of a filter's response range is (a) difficult because in order to accurately determine the edge probability in a low response bin (which could easily be < 0.001), a large number of edges in that bin must be accurately labeled; and (b) important because the prior probability of being in these low response bins is very high (Figure 2B), so that the overall shape of the on-edge likelihood distribution is strongly affected by the edge probability estimates in these bins. Low response bins can also become contaminated with large numbers of false positive edges when edges are either inaccurately labeled (i.e., off by a pixel or two) or when labeling is left to human discretion: Human labelers are often tempted to label at a spatial scale or mixture of scales that may be mismatched to the (unknown) scale(s) of the edge detectors that will later be evaluated, such as labeling the smooth outer perimeter of a tree or bush. This is crucial given that both the existence and orientation of edges in natural images are scale-dependent concepts. Human labelers may also reject certain classes of strong edges based on a perceived lack of importance, for example the stripes produced by window shades, or the large number of uninteresting strong edges contained within textures. Given that unimportant or uninteresting strong edges may constitute a large fraction of all strong edges in natural images, these labeling choices can dilute edge probability at the strong-response end of the on-edge likelihood distribution. Taken together, these effects can produce a substantial rearrangement of the probability density in the on-edge distribution along the detector's response axis, mostly pushing towards the low end of the response range. The greater overlap of the on- and off-edge distributions that this rearrangement causes leads to the appearance of reduced edge detection performance in a precision-recall analysis. 
Presumably for reasons such as these, benchmarking scores, and the rankings they generate on specific images, often seem visually unintuitive. 
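For reference, the precision-recall analysis discussed above has a standard form: sort the detector's scores, sweep a threshold down through them, and record precision and recall at each cut. A generic sketch (not the benchmark toolset's code; assumes at least one positive label):

```python
def precision_recall(scores, labels):
    """Precision-recall points from detector scores and binary
    ground-truth labels, one point per threshold position."""
    # Visit samples from highest to lowest score
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    total_pos = sum(labels)  # assumed > 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points  # list of (precision, recall) pairs
```

In this framing, the labeling pathologies described above act directly on the curve: false positives that leak into low-response bins depress precision at high recall, producing the apparent performance loss noted in the text.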
Given the difficulties in interpreting PR curves, we opted instead to assess edge detection performance by (a) comparing tuning curves of the rm* algorithm with those of the underlying linear filters: We found that tuning in position and orientation was significantly sharpened by the population-based probability calculation; (b) verifying that the spread in local edge probability for a fixed linear filter score was consistent with the judgments of human labelers; this acted as a form of cross validation since the examples examined were not part of the ground truth training data (Figure 8); and (c) making qualitative visual comparisons between maps of edge probability generated by the rm* algorithm and those generated by other edge/contour-detection algorithms (including a graded Canny variant [implementation from the Berkeley boundary detection benchmark toolset]; see Figure 10). Overall, we found that despite having no free parameters, rm* responses were well tuned on natural edges, were predictive of human judgments on unlabeled data, and produced semantically well-structured edge maps on a five-pixel length scale. 
Conclusion
The probabilistic approach to edge detection described here can likely be adapted to other types of visual features. However, the constraint that the probability calculation must be expressible in terms of sums of positive and negative interactions among nearby cells, tied to the CCI assumption, means that the process we have outlined here, whether applied to edges or other features, can only be a first stage in a multistage process. Nonetheless, the ability to break a complex natural feature-extraction process into a first quasilinear stage where cue independence roughly holds, followed by additional processing stages where bona fide nonlinear interactions can occur, has the advantage of modularity, and seems likely to simplify the overall computational scheme. Future work will be needed to determine whether this type of breakdown into independent (linear) versus dependent (nonlinear) cue processing is a rare or common feature of biological sensory systems. 
Acknowledgments
We thank Bosco Tjan and Norberto Grzywacz for numerous useful discussions in the course of this work. This work was funded by NIH/NEI BRP grant EY016093. 
Commercial relationships: yes. Part of this work has been applied for patent protection under US Patent Application 13/894,276. 
Corresponding author: Chaithanya A. Ramachandra. 
Email: cramacha@usc.edu. 
Address: Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA. 
References
Adini Y. Sagi D. Tsodyks M. (1997). Excitatory–inhibitory network in the visual cortex: Psychophysical evidence. Proceedings of the National Academy of Sciences, USA, 94 (19), 10426–10431. [CrossRef]
Anderson J. S. Carandini M. Ferster D. (2000). Orientation tuning of input conductance, excitation, and inhibition in cat primary visual cortex. Journal of Neurophysiology, 84 (2), 909–926. [PubMed]
Arbeláez P. Maire M. Fowlkes C. Malik J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33 (5), 898–916. doi:10.1109/TPAMI.2010.161. [CrossRef] [PubMed]
Archie K. A. Mel B. W. (2000). A model for intradendritic computation of binocular disparity. Nature Neuroscience, 3 (1), 54–63. [CrossRef] [PubMed]
Bauer R. Heinze S. (2002). Contour integration in striate cortex. Experimental Brain Research, 147 (2), 145–152. doi:10.1007/s00221-002-1178-6. [CrossRef] [PubMed]
Beck J. M. Ma W. J. Kiani R. Hanks T. Churchland A. K. Roitman J., … Shadlen M. N. (2008). Probabilistic population codes for Bayesian decision making. Neuron, 60 (6), 1142–1152. doi:10.1016/j.neuron.2008.09.021. [CrossRef] [PubMed]
Biederman I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94 (2), 115–147. doi:10.1037/0033-295X.94.2.115. [CrossRef] [PubMed]
Biederman I. Ju G. (1988). Surface versus edge-based determinants of visual recognition. Cognitive Psychology, 20 (1), 38–64. doi:10.1016/0010-0285(88)90024-2. [CrossRef] [PubMed]
Blakemore C. Tobin E. A. (1972). Lateral inhibition between orientation detectors in the cat's visual cortex. Experimental Brain Research, 15 (4), 439–440. [PubMed]
Bonds A. B. (1989). Role of inhibition in the specification of orientation selectivity of cells in the cat striate cortex. Visual Neuroscience, 2 (01), 41–55. doi:10.1017/S0952523800004314. [CrossRef] [PubMed]
Bowyer K. Kranenburg C. Dougherty S. (1999). Edge detector evaluation using empirical ROC curves. In Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on ( Vol. 1, p. 359). doi:10.1109/CVPR.1999.786963.
Carandini M. Heeger D. (1994). Summation and division by neurons in primate visual cortex. Science, 264 (5163), 1333–1336. doi:10.1126/science.8191289. [CrossRef] [PubMed]
Carandini M. Heeger D. J. (2012). Normalization as a canonical neural computation. Nature Reviews Neuroscience, 13 (1), 51–62. doi:10.1038/nrn3136.
Carandini M. Heeger D. J. Movshon J. A. (1997). Linearity and normalization in simple cells of the macaque primary visual cortex. Journal of Neuroscience, 17 (21), 8621–8644. [PubMed]
Cavanaugh J. R. Bair W. Movshon J. A. (2002). Selectivity and spatial distribution of signals from the receptive field surround in macaque V1 neurons. Journal of Neurophysiology, 88 (5), 2547–2556. [CrossRef] [PubMed]
Choe Y. Miikkulainen R. (1998). Self-organization and segmentation in a laterally connected orientation map of spiking neurons. Neurocomputing, 21 (1–3), 139–158. doi:10.1016/S0925-2312(98)00040-X. [CrossRef]
Daugman J. G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A: Optics and Image Science, 2 (7), 1160–1169. [CrossRef]
DeAngelis G. C. Ohzawa I. Freeman R. D. (1993). Spatiotemporal organization of simple-cell receptive fields in the cat's striate cortex. II. Linearity of temporal and spatial summation. Journal of Neurophysiology, 69 (4), 1118–1135. [PubMed]
DeAngelis G. C. Robson J. G. Ohzawa I. Freeman R. D. (1992). Organization of suppression in receptive fields of neurons in cat visual cortex. Journal of Neurophysiology, 68 (1), 144–163. [PubMed]
DeCarlo D. (2008). Perception of line drawings. Presented at the SIGGRAPH 2008, Los Angeles, CA. Retrieved from http://gfx.cs.princeton.edu/proj/sg08lines/lines-7-perception.pdf.
Deneve S. Latham P. E. Pouget A. (1999). Reading population codes: A neural implementation of ideal observers. Nature Neuroscience, 2 (8), 740–745. doi:10.1038/11205. [PubMed]
Dollar P. Tu Z. Belongie S. (2006). Supervised learning of edges and object boundaries. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 ( pp. 1964–1971). Retrieved from http://portal.acm.org/citation.cfm?id=1153171.1153683.
Douglas R. J. Martin K. A. (1991). A functional microcircuit for cat visual cortex. The Journal of Physiology, 440 (1), 735–769. [CrossRef] [PubMed]
Dresp B. (1993). Bright lines and edges facilitate the detection of small light targets. Spatial Vision, 7 (3), 213–225. doi:10.1163/156856893X00379. [CrossRef] [PubMed]
Dresp B. Grossberg S. (1997). Contour integration across polarities and spatial gaps: From local contrast filtering to global grouping. Vision Research, 37 (7), 913–924. doi:10.1016/S0042-6989(96)00227-1. [CrossRef] [PubMed]
Elder J. H. Zucker S. W. (1998). Local scale control for edge detection and blur estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20 (7), 699–716. doi:10.1109/34.689301. [CrossRef]
Field D. J. Hayes A. Hess R. F. (1993). Contour integration by the human visual system: Evidence for a local “association field.” Vision Research, 33 (2), 173–193. [CrossRef] [PubMed]
Fine I. MacLeod D. I. A. Boynton G. M. (2003). Surface segmentation based on the luminance and color statistics of natural scenes. Journal of the Optical Society of America A, 20 (7), 1283–1291. doi:10.1364/JOSAA.20.001283. [CrossRef]
Fiser J. Berkes P. Orbán G. Lengyel M. (2010). Statistically optimal perception and learning: From behavior to neural representations. Trends in Cognitive Sciences, 14 (3), 119–130. doi:10.1016/j.tics.2010.01.003. [CrossRef] [PubMed]
Gardner J. L. Anzai A. Ohzawa I. Freeman R. D. (1999). Linear and nonlinear contributions to orientation tuning of simple cells in the cat's striate cortex. Visual Neuroscience, 16 (6), 1115–1121. [CrossRef] [PubMed]
Geisler W. S. Perry J. S. (2009). Contour statistics in natural images: Grouping across occlusions. Visual Neuroscience, 26 (1), 109–121. doi:10.1017/S0952523808080875. [CrossRef] [PubMed]
Geisler W. S. Perry J. S. Super B. J. Gallogly D. P. (2001). Edge co-occurrence in natural images predicts contour grouping performance. Vision Research, 41 (6), 711–724. [CrossRef] [PubMed]
Georgopoulos A. P. Schwartz A. B. Kettner R. E. (1986). Neuronal population coding of movement direction. Science (New York, N.Y.), 233 (4771), 1416–1419. [CrossRef] [PubMed]
Gerhard F. Savin C. Triesch J. (2009). A robust biologically plausible implementation of ICA-like learning. European Symposium on Artificial Neural Networks – Advances in Computational Intelligence and Learning. Bruges, Belgium, April 22-24, 2009. ESANN, Retrieved from https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2009-74.pdf.
Gilbert C. D. Wiesel T. N. (1990). The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat. Vision Research, 30 (11), 1689–1701. doi:10.1016/0042-6989(90)90153-C. [CrossRef] [PubMed]
Grigorescu C. Petkov N. Westenberg M. (2003). Contour detection based on nonclassical receptive field inhibition. IEEE Transactions on Image Processing, 12 (7), 729–739. [CrossRef]
Grossberg S. Williamson J. R. (2001). A neural model of how horizontal and interlaminar connections of visual cortex develop into adult circuits that carry out perceptual grouping and learning. Cerebral Cortex, 11 (1), 37–58. doi:10.1093/cercor/11.1.37. [CrossRef] [PubMed]
Guy G. Medioni G. (1992). Perceptual grouping using global saliency-enhancing operators. In 11th IAPR International Conference on Pattern Recognition, 1992. Vol.I. Conference A: Computer Vision and Applications, Proceedings ( pp. 99–103). doi:10.1109/ICPR.1992.201517.
Heeger D. J. (1992). Half-squaring in responses of cat striate cells. Visual Neuroscience, 9 (05), 427–443. doi:10.1017/S095252380001124X. [CrossRef] [PubMed]
Hinton G. (1986). Learning distributed representation of concepts (pp. 1–12). Paper presented at the Proceedings of the Eighth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum Associates.
Hubel D. H. Wiesel T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 160 (1), 106–154. [CrossRef] [PubMed]
Hyvarinen A. (1999). Fast and robust fixed-point algorithms for independent component analysis. IEEE Transactions on Neural Networks, 10 (3), 626–634. doi:10.1109/72.761722. [CrossRef] [PubMed]
Isaacson J. S. Scanziani M. (2011). How inhibition shapes cortical activity. Neuron, 72 (2), 231–243. doi:10.1016/j.neuron.2011.09.027. [CrossRef] [PubMed]
Jacobs R. A. ( 1995). Methods for combining experts' probability assessments. Neural Computation, 7 (5), 867–888. [CrossRef] [PubMed]
Jiang X. Bunke H. (1999). Edge detection in range images based on scan line approximation. Computer Vision and Image Understanding, 73 (2), 183–199. doi:10.1006/cviu.1998.0715. [CrossRef]
Jones J. P. Palmer L. A. (1987). An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. Journal of Neurophysiology, 58 (6), 1233–1258. [PubMed]
Kapadia M. K. Ito M. Gilbert C. D. Westheimer G. (1995). Improvement in visual sensitivity by changes in local context: Parallel studies in human observers and in V1 of alert monkeys. Neuron, 15 (4), 843–856. doi:10.1016/0896-6273(95)90175-2. [CrossRef] [PubMed]
Kapadia M. K. Westheimer G. Gilbert C. D. (2000). Spatial distribution of contextual interactions in primary visual cortex and in visual perception. Journal of Neurophysiology, 84 (4), 2048–2062. [PubMed]
Karklin Y. Lewicki M. (2003). Learning higher-order structures in natural images. Network: Computation in Neural Systems, 14 (3), 483–499. doi:10.1088/0954-898X/14/3/306. [CrossRef]
Karklin Y. Lewicki M. S. (2005). A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural Computation, 17 (2), 397–423. doi:10.1162/0899766053011474. [CrossRef] [PubMed]
Konishi S. Yuille A. L. Coughlan J. M. Zhu S. C. (2003). Statistical edge detection: Learning and evaluating edge cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25 (1), 57–74. doi:10.1109/TPAMI.2003.1159946. [CrossRef]
Kording K. P. Wolpert D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427 (6971), 244–247. doi:10.1038/nature02169. [CrossRef] [PubMed]
Kourtzi Z. Huberle E. (2005). Spatiotemporal characteristics of form analysis in the human visual cortex revealed by rapid event-related fMRI adaptation. NeuroImage, 28 (2), 440–452. doi:10.1016/j.neuroimage.2005.06.017. [CrossRef] [PubMed]
Kourtzi Z. Kanwisher N. (2001). Representation of perceived object shape by the human lateral occipital complex. Science, 293 (5534), 1506–1509. doi:10.1126/science.1061133. [CrossRef] [PubMed]
Kourtzi Z. Tolias A. S. Altmann C. F. Augath M. Logothetis N. K. (2003). Integration of local features into global shapes: Monkey and human FMRI studies. Neuron, 37 (2), 333–346. [CrossRef] [PubMed]
Li W. Gilbert C. D. (2002). Global contour saliency and local colinear interactions. Journal of Neurophysiology, 88 (5), 2846–2856. doi:10.1152/jn.00289.2002. [CrossRef] [PubMed]
Li Z. (1998). A neural model of contour integration in the primary visual cortex. Neural Computation, 10 (4), 903–940. doi:10.1162/089976698300017557. [CrossRef] [PubMed]
Liang Y. Simoncelli E. P. Lei Z. (2000). Color channels decorrelation by ICA transformation in the wavelet domain for color texture analysis and synthesis. In IEEE Conference on Computer Vision and Pattern Recognition, 2000. Proceedings ( Vol. 1, pp. 606–611). doi:10.1109/CVPR.2000.855875.
Lowe D. G. (1999). Object recognition from local scale-invariant features. In The Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999 ( Vol. 2, pp. 1150–1157). doi:10.1109/ICCV.1999.790410.
Ludtke N. Wilson R. C. (2003). A mixture model for population codes of Gabor filters. IEEE Transactions on Neural Networks, 14 (4), 794–803. doi:10.1109/TNN.2003.813838. [CrossRef] [PubMed]
Ma W. J. Beck J. M. Latham P. E. Pouget A. (2006). Bayesian inference with probabilistic population codes. Nature Neuroscience, 9 (11), 1432–1438. doi:10.1038/nn1790. [CrossRef] [PubMed]
Malik J. Belongie S. Leung T. Shi J. (2001). Contour and texture analysis for image segmentation. International Journal of Computer Vision, 43 (1), 7–27. doi:10.1023/A:1011174803800. [CrossRef]
Marr D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: Freeman and Company.
Martin D. Fowlkes C. Tal D. Malik J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Computer Vision, 2001. Proceedings of the 8th IEEE International Conference, Vol. 2 (pp. 416–423).
Martin D. R. Fowlkes C. C. Malik J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 (5), 530–549.
Mel B. W. Ruderman D. L. Archie K. A. (1998). Translation-invariant orientation tuning in visual “complex” cells could derive from intradendritic computations. The Journal of Neuroscience, 18 (11), 4325–4334. [PubMed]
Miller K. D. Troyer T. W. (2002). Neural noise can explain expansive, power-law nonlinearities in neural response functions. Journal of Neurophysiology, 87 (2), 653–659. [PubMed]
Morrone M. C. Burr D. C. (1988). Feature detection in human vision: A phase-dependent energy model. Proceedings of the Royal Society of London. Series B, Biological Sciences, 221–245.
Nessler B. Pfeiffer M. Buesing L. Maass W. (2013). Bayesian computation emerges in generic cortical microcircuits through spike-timing-dependent plasticity. PLoS Computational Biology, 9 (4), e1003037. [CrossRef] [PubMed]
Ohzawa I. Deangelis G. C. Freeman R. D. (1990). Stereoscopic depth discrimination in the visual cortex: Neurons ideally suited as disparity detectors. Science, 249 (4972), 1037–1041. [CrossRef] [PubMed]
Papari G. Petkov N. (2011). Edge and line oriented contour detection: State of the art. Image and Vision Computing, 29 (2–3), 79–103. doi:10.1016/j.imavis.2010.08.009. [CrossRef]
Paradiso M. A. (1988). A theory for the use of visual orientation information which exploits the columnar structure of striate cortex. Biological Cybernetics, 58 (1), 35–49. doi:10.1007/BF00363954. [CrossRef] [PubMed]
Parent P. Zucker S. W. (1989). Trace inference, curvature consistency, and curve detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11 (8), 823–839. doi:10.1109/34.31445. [CrossRef]
Parra L. Spence C. Sajda P. (2001). Higher-order statistical properties arising from the non-stationarity of natural signals. Advances in Neural Information Processing Systems, 13, 786–792.
Polat U. Mizobe K. Pettet M. W. Kasamatsu T. Norcia A. M. (1998). Collinear stimuli regulate visual responses depending on cell's contrast threshold. Nature, 391 (6667), 580–584. doi:10.1038/35372. [PubMed]
Polat U. Sagi D. (1993). Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research, 33 (7), 993–999. doi:10.1016/0042-6989(93)90081-7. [CrossRef] [PubMed]
Pollen D. A. Gaska J. P. Jacobson L. D. (1989). Physiological constraints on models of visual cortical function. New York, NY: Cambridge University Press.
Pouget A. Dayan P. Zemel R. (2000). Information processing with population codes. Nature Reviews Neuroscience, 1 (2), 125–132. [CrossRef] [PubMed]
Priebe N. J. Ferster D. (2012). Mechanisms of neuronal computation in mammalian visual cortex. Neuron, 75 (2), 194–208. doi:10.1016/j.neuron.2012.06.011. [CrossRef] [PubMed]
Ross W. Grossberg S. Mingolla E. (2000). Visual cortical mechanisms of perceptual grouping: Interacting layers, networks, columns, and maps. Neural Networks, 13 (6), 571–588. doi:10.1016/S0893-6080(00)00040-X. [CrossRef] [PubMed]
Sajda P. Finkel L. H. (1995). Intermediate-level visual representations and the construction of surface perception. Journal of Cognitive Neuroscience, 7 (2), 267–291. [CrossRef] [PubMed]
Sanger T. D. (1996). Probability density estimation for the interpretation of neural population codes. Journal of Neurophysiology, 76 (4), 2790–2793. [PubMed]
Schwartz O. Simoncelli E. P. (2001). Natural signal statistics and sensory gain control. Nature Neuroscience, 4 (8), 819–825. doi:10.1038/90526. [CrossRef] [PubMed]
Sha'asua A. Ullman S. (1988). Structural saliency: The detection of globally salient structures using a locally connected network. In Second International Conference on Computer Vision, pp. 321–327.
Sigman M. Cecchi G. A. Gilbert C. D. Magnasco M. O. (2001). On a common circle: Natural scenes and Gestalt rules. Proceedings of the National Academy of Sciences, USA, 98 (4), 1935–1940. doi:10.1073/pnas.98.4.1935. [CrossRef]
Stein A. Hoiem D. Hebert M. (2007). Learning to find object boundaries using motion cues. In IEEE 11th International Conference on Computer Vision, 2007 ( pp. 1–8). doi:10.1109/ICCV.2007.4408841.
Szentagothai J. (1978). The neuron network of the cerebral cortex: A functional interpretation. Proceedings of the Royal Society of London. Series B, Biological Sciences, 219–248.
Tenenbaum J. B. Kemp C. Griffiths T. L. Goodman N. D. (2011). How to grow a mind: statistics, structure, and abstraction. Science, 331 (6022), 1279–1285. doi:10.1126/science.1192788. [CrossRef] [PubMed]
VanRullen R. Delorme A. Thorpe S. (2001). Feed-forward contour integration in primary visual cortex based on asynchronous spike propagation. Neurocomputing, 38–40, 1003–1009. doi:10.1016/S0925-2312(01)00445-3. [CrossRef]
Vogels R. (1990). Population coding of stimulus orientation by striate cortical cells. Biological Cybernetics, 64 (1), 25–31. doi:10.1007/BF00203627. [CrossRef] [PubMed]
Wainwright M. J. Simoncelli E. P. (2000). Scale mixtures of Gaussians and the statistics of natural images. Advances in Neural Processing Systems, 12, 855–861.
Webb B. S. Dhruv N. T. Solomon S. G. Tailby C. Lennie P. (2005). Early and late mechanisms of surround suppression in striate cortex of macaque. The Journal of Neuroscience, 25 (50), 11666–11675. [CrossRef] [PubMed]
Weiss Y. Simoncelli E. P. Adelson E. H. (2002). Motion illusions as optimal percepts. Nature Neuroscience, 5 (6), 598–604. [CrossRef] [PubMed]
Will S. Hermes L. Buhmann J. M. Puzicha J. (2000). On learning texture edge detectors. In 2000 International Conference on Image Processing, 2000. Proceedings ( Vol. 3, pp. 877–880). doi:10.1109/ICIP.2000.899596.
Williams L. R. Jacobs D. W. (1997). Stochastic completion fields: A neural model of illusory contour shape and salience. Neural Computation, 9 (4), 837–858. doi:10.1162/neco.1997.9.4.837. [CrossRef] [PubMed]
Xing D. Ringach D. L. Hawken M. J. Shapley R. M. (2011). Untuned suppression makes a major contribution to the enhancement of orientation selectivity in macaque V1. The Journal of Neuroscience, 31 (44), 15972–15982. doi:10.1523/JNEUROSCI.2245-11.2011. [CrossRef] [PubMed]
Yen S.-C. Finkel L. H. (1998). Extraction of perceptually salient contours by striate cortical networks. Vision Research, 38 (5), 719–741. doi:10.1016/S0042-6989(97)00197-1. [CrossRef] [PubMed]
Yuille A. Kersten D. (2006). Vision as Bayesian inference: Analysis by synthesis? Trends in Cognitive Sciences, 10 (7), 301–308. doi:10.1016/j.tics.2006.05.002. [CrossRef] [PubMed]
Yuille A. L. Grzywacz N. M. (1988). A computational theory for the perception of coherent visual motion. Nature, 333 (6168), 71–74. doi:10.1038/333071a0. [CrossRef] [PubMed]
Zemel R. S. Dayan P. Pouget A. (1998). Probabilistic interpretation of population codes. Neural Computation, 10 (2), 403–430. doi:10.1162/089976698300017818. [CrossRef] [PubMed]
Zetzsche C. Röhrbein F. (2001). Nonlinear and extra-classical receptive field properties and the statistics of natural scenes. Network (Bristol, England), 12 (3), 331–350. [CrossRef] [PubMed]
Zhang N. R. von der Heydt R. (2010). Analysis of the context integration mechanisms underlying figure–ground organization in the visual cortex. The Journal of Neuroscience, 30 (19), 6482–6496. [CrossRef] [PubMed]
Zhou C. Mel B. W. (2008). Cue combination and color edge detection in natural scenes. Journal of Vision, 8 (4): 4, 1–25, http://www.journalofvision.org/content/8/4/4, doi:10.1167/8.4.4. [PubMed] [Article] [CrossRef]
Figure 1
 
Tangential versus orthogonal regions surrounding a candidate edge. Oriented filters in the vicinity of a reference location (marked by a red rectangle) can be loosely classified into two groups—those in the orthogonal region (in blue) and those in the tangential region (in green). Tangential edges are particularly subject to higher order correlations. For example, given an edge at the reference location, evidence for edges at Locations A and C is positive evidence for edges at B and D, but negative evidence for edges at E and F.
Figure 2
 
The linear filter, its statistics, and its use in ground truth labeling. (A) Oriented linear filter kernel. Convolution results were rectified at zero to obtain the filter response ri. The pixel that denotes the location of the filter is marked by red shading. (B) The log pdf of filter responses measured at all locations and orientations in the database. (C) Example image patches at three linear response levels measured at the reference location (red rectangle). (D) Probability of an edge for a given linear response (red data points). Fit to data (solid curve) is a sigmoid p = 1/(1 + e^(−s(x − t))); s = 9.9, t = 0.3804.
Figure 3
 
Modeling likelihood functions of neighboring filters. (A) Distribution of responses for a filter at the same center but rotated 45° relative to the reference filter, for rref = 0.3 (upper panel) and rref = 0.5 (lower panel). Filter responses, including rref, are normalized to the range [0, 1]. Red curves are for when an edge was judged to be present at the reference location, blue curves for when an edge was judged to be absent. Each panel shows Poisson-smoothed data (thin curves) and parametric fits (thick curves). (B) Plots of the five parameters used to fit the Poisson-smoothed likelihoods as a function of reference filter contrast, for a different filter, depicted in the inset of Panel C. The off-edge case had only the first four data points, given that an image patch only very rarely contains no edge when rref = 0.9. (C) Examples of on-edge likelihood functions generated from the parametric model at a range of reference filter values, with the Poisson-smoothed data superimposed in thin black lines for the five cases for which labeled data was actually collected (red curves). Green dashed lines are on-edge likelihood functions generated from the parametric model at intermediate, unlabeled reference filter values. Generalization to new data was good: the green solid line shows Poisson-smoothed data collected at rref = 0.2, which was not part of the training set (quality of fit to model prediction: r² = 0.99).
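The Poisson smoothing and the five-parameter fits themselves are not reproduced here, but the raw conditional likelihoods they model, P(r | edge) and P(r | no edge), can be estimated from labeled responses with simple histograms. A rough stand-in sketch (add-one smoothing replaces the paper's Poisson smoothing, purely for illustration):

```python
import numpy as np

def conditional_likelihoods(r, edge_labels, n_bins=32):
    """Histogram estimates of P(r | edge) and P(r | no edge) from a set
    of filter responses r in [0, 1] with boolean edge labels. This is a
    simplified stand-in for the paper's Poisson-smoothed, parametrically
    fit likelihood functions."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    r = np.asarray(r, float)
    edge = np.asarray(edge_labels, bool)
    on, _ = np.histogram(r[edge], bins=bins)
    off, _ = np.histogram(r[~edge], bins=bins)
    # Add-one smoothing keeps likelihood ratios finite in empty bins.
    on = (on + 1) / (on + 1).sum()
    off = (off + 1) / (off + 1).sum()
    return bins, on, off
```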
Figure 4
 
Selecting informative filters. (A) Chernoff information of neighboring filters at three different reference contrasts (rref = 0.3, 0.5, and 0.7). (B) Weighted average ranks over contrast levels for all neighboring filters, inverted so that tall columns indicate more information. The top 30% of the 112 filters are marked in red. (C) Position and orientation of the most informative filters in the orthogonal region, shown relative to the reference location.
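The Chernoff information used to rank filters measures how separable a filter's on-edge and off-edge response distributions are. A sketch, assuming the two distributions arrive as discrete (binned) likelihoods; the grid search over the Chernoff exponent and the epsilon for empty bins are implementation choices, not from the paper:

```python
import numpy as np

def chernoff_information(p, q, n_lambda=101):
    """Chernoff information between two discrete distributions p and q
    (e.g., a filter's on-edge and off-edge response histograms):
    C(p, q) = -min over lam in [0, 1] of log(sum_x p(x)^lam * q(x)^(1-lam)),
    here approximated by a grid search over lam."""
    p = np.asarray(p, float)
    q = np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    eps = 1e-12  # avoids log(0) when a histogram bin is empty
    vals = [np.log(np.sum((p + eps) ** lam * (q + eps) ** (1.0 - lam)))
            for lam in np.linspace(0.0, 1.0, n_lambda)]
    return -min(vals)
```

Identical distributions give (near) zero information; the more separable the on- and off-edge responses, the larger the value, which is what makes it a natural score for culling uninformative filters.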
Figure 5
 
Distribution of mean absolute pairwise correlations (MAPC) scores for ∼ 1.3 million six-wise combinations of the most informative filters. Two 6 × 6 pairwise correlation matrices are shown at upper right for two cases: red triangle corresponds to a filter set with one of the lowest correlation scores; this set was eventually used in the edge detection algorithm; green square shows a case with an average MAPC score. Least inter-correlated 0.25% of filter sets (left tail of distribution, shaded red) were carried forward for further processing.
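The MAPC score itself is straightforward to compute. A sketch, assuming the responses of a candidate six-filter set are collected as columns of a sample matrix:

```python
import numpy as np

def mapc(responses):
    """Mean absolute pairwise correlation (MAPC) of a filter set.
    `responses` is an (n_samples, n_filters) array of filter outputs;
    the score averages |Pearson r| over all distinct filter pairs."""
    corr = np.corrcoef(responses, rowvar=False)
    n = corr.shape[0]
    iu = np.triu_indices(n, k=1)  # upper triangle = distinct pairs
    return np.abs(corr[iu]).mean()
```

Low MAPC is what licenses treating the filters as (approximately) independent, so that the joint likelihoods can be factored into per-filter terms.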
Figure 6
 
Orientation and position tuning of the local edge probability (LEP) calculated for each of the ∼ 3,400 filter sets tested. (A) Example orientation tuning curves for the chosen filter set are shown at five values of rref. Averages for each reference value are shown as thick colored lines. Inset shows the response at the preferred orientation at five different levels of contrast. (B) For each tested filter set, tuning curves were generated for each of the ∼ 3,000 human-labeled edges in the database. Full width at half maximum (FWHM) values were calculated for each tuning curve, the results were averaged, and the average tuning width for that filter set was entered into the histogram. The orientation tuning score of the chosen filter set is marked by a red triangle; the much larger FWHM score for a single linear filter at the reference location is marked by the green square. (C) Positional tuning curves covering three pixels above and below the reference position. (D) Distribution of average FWHM values for the positional tuning curves. The tuning score for the chosen filter set is again marked by a red triangle, and the tuning for a linear filter at the reference location by a green square.
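The FWHM of a sampled tuning curve can be computed by interpolating the half-maximum crossings on either side of the peak. A sketch, assuming a single-peaked curve whose half-maximum is crossed within the sampled range:

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a tuning curve sampled at points x
    (e.g., orientation in degrees, or position in pixels) with responses y,
    using linear interpolation between the samples straddling each crossing."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    i0, i1 = above[0], above[-1]
    # Left crossing: interpolate between the last point below and first above.
    if i0 > 0:
        x_left = np.interp(half, [y[i0 - 1], y[i0]], [x[i0 - 1], x[i0]])
    else:
        x_left = x[0]
    # Right crossing: same idea, approached from the far side.
    if i1 < len(y) - 1:
        x_right = np.interp(half, [y[i1 + 1], y[i1]], [x[i1 + 1], x[i1]])
    else:
        x_right = x[-1]
    return x_right - x_left
```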
Figure 7
 
(A) The set of six neighboring filters finally chosen for the local edge probability computation. (B) The on-edge (red) and off-edge (blue) likelihoods for each of the six neighboring filters when rref = 0.3. (C) Likelihood ratios (i.e., the ratio of the red and blue curves in B) for each filter.
Figure 8
 
Linear response versus local edge probability. (A) Scatter plot of the linear filter response versus the LEP for the image shown in C. Colored dots mark cases at the 90th (red) and 10th (blue) percentiles within each of the five marked bins along the linear response axis (bin width = 0.02). (B) Image patches corresponding to the marked examples in A are shown with their corresponding LEP scores. Note the much higher LEP scores, and edge probability, in the top versus the bottom row. (C) All image locations corresponding to the scatter plot in A with LEP scores over the 80th percentile (red line in A) were marked with red line segments in the left panel, and all locations below the 20th percentile (blue line in A) were marked by blue line segments in the right panel. Red lines are generally well aligned with object edges, whereas most blue lines are misplaced or misoriented.
Figure 9
 
Illustration of the local edge probability computation at two locations with the same linear score but very different LEPs. (A) Image patches with marked reference locations. The linear filter response is the same (rref = 0.3) in both patches. (B) Log likelihood ratio curves, with values marked by red and blue symbols for the six neighboring filters applied to the upper and lower image patches, respectively. (C) Log likelihood ratios shown as bar heights. Resulting LEP values are shown above and below the image patches in A.
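The combination rule illustrated here, summing log likelihood ratios across the six neighboring filters, is a naive-Bayes-style update: under the (approximate) independence secured by the MAPC screening, each filter contributes one additive term to the log odds of an edge. A minimal sketch, where `prior` would come from the reference filter's own edge probability (Figure 2D):

```python
import math

def local_edge_probability(prior, likelihood_ratios):
    """Combine a prior edge probability with the per-filter likelihood
    ratios P(r_i | edge) / P(r_i | no edge) of the neighboring filters,
    assuming conditional independence. Returns the posterior LEP."""
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))
```

A filter whose response favors the on-edge likelihood (ratio > 1) pushes the LEP up; one favoring the off-edge likelihood (ratio < 1) pushes it down, which is how two patches with identical linear scores can end up with very different LEPs.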
Figure 10
 
Results of applying the LEP algorithm to natural images. The maximum value of the local edge probability across all orientations is shown at each pixel as the gray level. Pb and Canny results were generated with a scale parameter of one.
Table 1
 
Labeling system used to score edges at the reference location, with the corresponding interpretation and assigned edge probability.
Score given    Interpretation              Assigned edge probability
1              Certainly no edge           0
2              Probably no edge            0.25
3              Can't tell—around 50/50     0.5
4              Probably an edge            0.75
5              Certainly an edge           1
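The score-to-probability mapping in Table 1 amounts to a simple lookup; as a sketch (the function name is illustrative, not from the paper):

```python
# Mapping from the 5-point labeling score (Table 1) to the edge
# probability assigned to a labeled reference location.
SCORE_TO_PROB = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}

def assigned_edge_probability(score):
    """Return the edge probability assigned to a human label (1-5)."""
    return SCORE_TO_PROB[score]
```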