To make vision possible, the visual nervous system must represent the most informative features in the light pattern captured by the eye. Here we use Gaussian scale–space theory to derive a multiscale model for edge analysis and we test it in perceptual experiments. At all scales there are two stages of spatial filtering. An odd-symmetric, Gaussian first derivative filter provides the input to a Gaussian second derivative filter. Crucially, the output at each stage is half-wave rectified before feeding forward to the next. This creates nonlinear channels selectively responsive to one edge polarity while suppressing spurious or “phantom” edges. The two stages have properties analogous to simple and complex cells in the visual cortex. Edges are found as peaks in a scale–space response map that is the output of the second stage. The position and scale of the peak response identify the location and blur of the edge. The model predicts remarkably accurately our results on human perception of edge location and blur for a wide range of luminance profiles, including the surprising finding that blurred edges look sharper when their length is made shorter. The model enhances our understanding of early vision by integrating computational, physiological, and psychophysical approaches.

$\sigma$, without prior assumptions about the structure of the input, the natural (indeed unique) “window” of observation (ter Haar Romeny, 2003) is the Gaussian function

$I(x)$ then is effectively a stack of images resulting from Gaussian smoothing of the input image at different scales. Computations about image structure are based on the spatial derivatives of these Gaussian-blurred images. For example, the Gaussian first derivative scale–space representation of the image gradients is given by

$I(x)$ with a set of Gaussian derivative operators at different scales.

$n$ can be treated in the same way:

$n = 1, 2, 3, \ldots$. RFs of simple cells in the visual cortex can often be well described as $n$th order Gaussian derivative operators (Figure 1A), where the RF has $n + 1$ adjacent ON and OFF subregions (Young & Lesperance, 2001). Reports of one to four or even five lobes in the simple cell RF (Hubel & Wiesel, 1962; Movshon et al., 1978; Kulikowski & Bishop, 1981; Camarda, Peterhans, & Bishop, 1985; DeAngelis, Ohzawa, & Freeman, 1993; Ringach, 2002) suggest that derivative operators up to third or fourth order could play a role in spatial vision.
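The link between derivative order and the number of RF subregions can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a one-dimensional unit-area Gaussian and uses the standard identity that the $n$th derivative of a Gaussian is the Gaussian multiplied by a (probabilists') Hermite polynomial of order $n$. The function names `gaussian_derivative` and `count_lobes` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def gaussian_derivative(x, sigma, n):
    """n-th derivative of a unit-area 1-D Gaussian with space constant sigma."""
    g = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0  # selects the Hermite polynomial He_n
    return (-1.0) ** n * hermeval(x / sigma, coeffs) * g / sigma**n

def count_lobes(profile):
    """Number of adjacent same-sign regions (ON or OFF) along a profile."""
    signs = np.sign(profile)
    signs = signs[signs != 0]  # ignore samples that fall exactly on a zero crossing
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))

x = np.linspace(-6.0, 6.0, 2401)  # positions in units of sigma
for n in range(1, 5):
    lobes = count_lobes(gaussian_derivative(x, sigma=1.0, n=n))
    print(f"derivative order {n}: {lobes} lobes")
```

Each order $n$ should yield $n + 1$ lobes, because the Hermite polynomial of order $n$ has $n$ real roots.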

$L_1$ across space ($x$) and scale ($\sigma$), given two Gaussian-blurred edges as input. How might the location, polarity, and blur of these edges be identified? The sign of the responses corresponds to the polarity of the edge (dark–light or light–dark), but the biggest response is always at the finest scale (1 pixel; top row of the map). This is true even for much larger blur (Figure 2G). The degree of blur is implicit in how far the response extends through scale, and across space, but further analysis would be needed to quantify it. The edge would be explicitly identified, however, if the response pattern had a unique peak at the scale and location of the edge. Lindeberg (1998) devised a powerful, general method of “scale selection” to achieve this goal. Lindeberg explored algorithms in which the response measures for localization and for scale selection could be different. We simplified his method by supposing that peaks in a single response surface (Equation 4) should identify the position and scale of the features. The Gaussian derivative operators are multiplied by a scale-dependent gain factor $\sigma^{\alpha}$ to give a normalized scale–space representation, $N_n$:

$N_n$ with respect to $\sigma$, it is straightforward to show that, for a given class of feature such as Gaussian edge, bar, or blob, one can choose the exponent $\alpha$ such that the response always peaks at the true location and scale of the feature (Lindeberg, 1998). For Gaussian-blurred edges (Figures 2A and 2F), $\alpha = n/2$ (where $n$ is an odd integer, implying odd-symmetric RFs). Figures 2C and 2H show that with this multiscale representation of gradients, $N_1$, there are unique response peaks at the correct locations and scales for the edges in the image. These edge features have been “made explicit” (Marr, 1982). Importantly, this rendering of edge finding as a simple peak-finding problem seems to reduce, or perhaps eliminate, the need for more elaborate algorithms that combine or “track” edge information across scales (Bergholm, 1987; Zhang & Bergholm, 1997). Figure A1 (Appendix 1) shows how the normalization converts monotonic response profiles across scale into peaked ones that identify the edge blur. Appendix 1 analyzes some of the scaling properties of these scale-normalized derivatives.
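A short numerical sketch may help to show why $\alpha = n/2$ works for edges. For a Gaussian edge of blur $b$, the first-derivative response at the edge is proportional to $(b^2 + \sigma^2)^{-1/2}$, which falls monotonically with $\sigma$; multiplying by $\sigma^{1/2}$ gives a function whose maximum lies at $\sigma = b$. The code below assumes a one-dimensional image sampled in pixels, sampled Gaussian derivative kernels truncated at $5\sigma$, and a quarter-octave grid of scales; all function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np
from math import erf

def d1_kernel(sigma):
    """Sampled first derivative of a unit-area Gaussian, truncated at 5 sigma."""
    r = int(np.ceil(5 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return -x / sigma**2 * g

def filt(signal, kernel):
    """Convolve, replicating the end values so the output keeps the input length."""
    r = len(kernel) // 2
    return np.convolve(np.pad(signal, r, mode="edge"), kernel, mode="valid")

def gradient_map(image, sigmas, alpha):
    """Rows are scales; each row is sigma**alpha times the first-derivative response."""
    return np.array([s**alpha * filt(image, d1_kernel(s)) for s in sigmas])

x = np.arange(-256, 257)
b = 8.0  # edge blur in pixels (illustrative value)
edge = 0.5 * (1.0 + np.vectorize(erf)(x / (b * np.sqrt(2.0))))
sigmas = 2.0 ** (np.arange(21) / 4.0)  # 1 to 32 pixels in quarter-octave steps

L1 = gradient_map(edge, sigmas, alpha=0.0)  # unnormalized gradients
N1 = gradient_map(edge, sigmas, alpha=0.5)  # alpha = n/2 with n = 1

row_raw, _ = np.unravel_index(np.argmax(L1), L1.shape)
row, col = np.unravel_index(np.argmax(N1), N1.shape)
print("unnormalized peak scale:", sigmas[row_raw])
print("normalized peak at x =", x[col], "scale =", sigmas[row])
```

Under these assumptions the unnormalized map peaks at the finest scale, whereas the normalized map peaks at the edge position and at the scale equal to the edge blur.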

*Mach Bands* (Georgeson, 2006). Although the classic light and dark bands correspond to peaks and troughs in the second derivative (Ratliff, 1965; Watt & Morgan, 1985), we find that in triangle wave gratings, blurred to different extents, Mach edges are seen at peaks in the third derivative where there are no corresponding peaks in the first derivative nor zero crossings in the second derivative, at any scale. This suggests that peaks in the third derivative are a cue to edge finding, but there is a crucial problem to be overcome.

$N_3$ appears unpromising because the additional differentiation introduces false-positives: two extra peaks or troughs in the response to each edge (Figures 2D and 2I). To avoid this problem in machine vision, Lindeberg (1998) used the first derivative to locate edges and the third derivative to identify their scale.

$N_3$ scheme solves the multiple peaks problem and makes accurate predictions about perceived edge location and blur. The necessary modification is to split the differentiation into two stages and to transmit only the positive parts of the response at each stage (Figure 1B). We denote the half-wave rectified first stage output as $R_1^{+}$ and the normalized second stage output as $N_3^{+}$:

$\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$ and $\sigma_1 = \sigma/4$. The choice of $\sigma_1 = \sigma/4$ was an empirical one, but we show later that the choice is not critical provided $0 < \sigma_1 < \sigma/2$; hence, we chose a fixed value ($\sigma/4$) in the middle of that range. It was not adjusted to fit individual experiments. The sign reversal introduced in Equation 6 simply ensures a positive output for edges of positive gradient. This scheme creates a nonlinear filter or “channel” (Figure 1B) that is responsive to edges of only one polarity. The first rectifier vetoes regions of negative gradient, and the second rectifier eliminates the negative troughs of response flanking an edge with positive gradient. Thus, the response $N_3^{+}$ (Figures 2E and 2J) is a simple unimodal surface whose peak location and scale correctly identify the position and blur of a dark–light edge, but with no response to a light–dark edge. The problem of identifying the edge is reduced to simple peak finding, with no false-positives.
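The two-stage channel can be sketched in a few lines. Equations 5 and 6 are not reproduced in this text, so the form used here is an assumption pieced together from the prose: a Gaussian first derivative at scale $\sigma_1 = \sigma/4$, half-wave rectification, a sign-inverted Gaussian second derivative at scale $\sigma_2 = \sqrt{\sigma^2 - \sigma_1^2}$, a second half-wave rectification, and a gain of $\sigma^{3/2}$ (that is, $\alpha = n/2$ with $n = 3$). The test image, the scale grid, the kernel truncation at $5\sigma$, and all function names are illustrative choices, not the authors' implementation.

```python
import numpy as np
from math import erf

def gaussian(x, sigma):
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def d1_kernel(sigma):
    r = int(np.ceil(5 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    return -x / sigma**2 * gaussian(x, sigma)

def d2_kernel(sigma):
    r = int(np.ceil(5 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    return (x**2 / sigma**4 - 1.0 / sigma**2) * gaussian(x, sigma)

def filt(signal, kernel):
    """Convolve, replicating the end values so the output keeps the input length."""
    r = len(kernel) // 2
    return np.convolve(np.pad(signal, r, mode="edge"), kernel, mode="valid")

def n3_channel(image, sigmas, polarity=+1, ratio=0.25):
    """Assumed filter-rectify-filter-rectify channel; polarity=-1 reverses the first filter."""
    rows = []
    for sigma in sigmas:
        s1 = ratio * sigma                    # first-stage scale (sigma / 4)
        s2 = np.sqrt(sigma**2 - s1**2)        # so that sigma^2 = s1^2 + s2^2
        r1 = np.maximum(0.0, polarity * filt(image, d1_kernel(s1)))  # first rectifier
        stage2 = -filt(r1, d2_kernel(s2))     # sign reversal gives a positive peak at the edge
        rows.append(sigma**1.5 * np.maximum(0.0, stage2))            # gain, second rectifier
    return np.array(rows)

x = np.arange(-256, 257)
b = 8.0  # blur of both edges, in pixels
Phi = lambda z: 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
bar = Phi((x + 64) / b) - Phi((x - 64) / b)   # rising edge at -64, falling edge at +64
# Scales start at 4 pixels so that sigma_1 >= 1 pixel and the sampled kernels stay accurate.
sigmas = 4.0 * 2.0 ** (np.arange(13) / 4.0)

pos = n3_channel(bar, sigmas, polarity=+1)
neg = n3_channel(bar, sigmas, polarity=-1)
i, j = np.unravel_index(np.argmax(pos), pos.shape)
k, m = np.unravel_index(np.argmax(neg), neg.shape)
print("positive channel peak: x =", x[j], "scale =", sigmas[i])
print("negative channel peak: x =", x[m], "scale =", sigmas[k])
```

With this test image each channel should show a single peak at its preferred edge, at the scale equal to the edge blur, and essentially no response at the edge of opposite polarity.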

$N_3^{+}$ response surface (e.g., Figure 2J) is much more compact than the $N_1$ surface (Figure 2H), implying better resolution for $N_3^{+}$ in both space and scale. Indeed, we found that $N_1$ failed to resolve separate peak responses when two edges of the same polarity and blur $b$ were separated by less than $4.8b$, but $N_3^{+}$ resolved the two edges down to a separation of only $2.5b$ (Figure S1).

$N_3^{-}$, is simply obtained by reversing the sign of the first filter:

$b_0$ was adjusted by computer over a series of trials, using a standard, one-up, one-down double interleaved staircase procedure to find the comparison blur that appeared to match the test edge blur, estimated as the average of the last 8 or 10 reversal points in the staircase run. The edges were always vertical. Their contrast polarity varied randomly from trial to trial. If the test image contained multiple edges (e.g., a sine wave grating), the observer judged the edge that was in the centre of the screen. Screen luminance was calibrated with a digital photometer. Lookup tables in the graphics card were used to linearize (gamma correct) the display and control the contrast of the image. Further details of procedure are given in Table 1, with illustrations in the Supplementary methods.

| Experiment no. | 1, 2 (Figure 3) | 3, 4, 5 (Figures 4, 5, and 6) |
|---|---|---|
| Computer graphics | Macintosh G4 | Windows PC + VSG card |
| Software | NIH Image | Pascal + VSG |
| Grayscale monitor | Eizo 6600 | Eizo 6500 |
| Frame rate | 75 Hz | 60 Hz (90 Hz in Experiment 5) |
| Mean luminance | 34 cd/m² | 75 cd/m² |
| Test image size | 256 × 256 pixels | 512 × 512 pixels |
| Test image window | 4.3° × 4.3° square with sharp edges | 5° × 5° circle, with smoothed edges |
| Gray background field | 17.2° (W) × 12.9° (H) | 6.1° diameter disk |
| Test duration | 300 ms | 230 ms |
| Interstimulus interval | 300 ms | 580 ms |
| Interval order | Random order | Test first, comparison second |
| Polarity of test and comparison edges | Same polarity within trials; varied between trials | Same polarity within trials; varied between trials |
| Staircase rule | One-up, one-down | One-up, one-down |
| Staircase step size after first two reversals | 1 dB (0.05 log unit) | 1 dB |
| Order of trials within sessions | 2 staircases per condition. All conditions randomly interleaved | 2 staircases per condition. Conditions run in randomly ordered blocks of 20 trials, until all staircases completed |
| No. of reversals used to estimate a match | 8 | 10 |
| No. of matches per subject per condition | 4 | 4–6 |
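The staircase logic summarized in the table can be illustrated with a small simulation. This is only a sketch under stated assumptions: the comparison blur moves down one step after a "comparison looks more blurred" response and up one step otherwise, a single 1 dB step (0.05 log unit) is used throughout, the first two reversals are discarded, and the match is the mean of the last 8 reversals taken in log units. The starting value, the deterministic simulated observer, and the function name `run_staircase` are hypothetical.

```python
import numpy as np

def run_staircase(says_comparison_more_blurred, start_blur, step_db=1.0,
                  n_discard=2, n_average=8, max_trials=1000):
    """One-up, one-down staircase on log blur; returns the estimated match."""
    step = step_db / 20.0             # 1 dB corresponds to 0.05 log10 unit
    log_blur = np.log10(start_blur)
    reversals = []
    previous = 0                      # direction of the previous step (0 = none yet)
    for _ in range(max_trials):
        # Decrease blur if the comparison looked more blurred, otherwise increase it.
        direction = -1 if says_comparison_more_blurred(10.0 ** log_blur) else +1
        if previous != 0 and direction != previous:
            reversals.append(log_blur)   # the staircase changed direction here
        previous = direction
        if len(reversals) == n_discard + n_average:
            break
        log_blur += direction * step
    if not reversals:
        raise RuntimeError("no reversals occurred within max_trials")
    return 10.0 ** np.mean(reversals[-n_average:])

# Hypothetical noise-free observer whose true match is 12 arcmin.
match = run_staircase(lambda blur: blur > 12.0, start_blur=20.0)
print(f"estimated match: {match:.2f} arcmin")
```

With a noise-free observer the staircase oscillates between the two levels that straddle the true match, so the estimate lands within one step of it; a real observer adds response variability, which is why several reversals are averaged.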

$I_{\mathrm{mix}}$ consisted of two Gaussian edges superimposed at the same location, with the same polarity, but different blurs ($b_1$, $b_2$):

$I_0$ is the fixed mean luminance, $x_0$ is the position of the edge, $\Phi(x, b)$ is the integral of the unit-area Gaussian $G(x, b)$ with space constant (blur) $b$, and $c_1$ and $c_2$ are the contrasts of the two component edges. Both edges were at the centre of the screen ($x_0 = 0$), and overall contrast $c_1 + c_2$ was constant (0.3). The ratio of contrasts of the two component edges $r = c_1/c_2$ was 0.1, 1, 3, 10, 30, or 100 and the component blurs ($b_1$, $b_2$) were (15, 5) or (30, 10) arcmin. Test intervals were 300 ms, separated by a gray (mean luminance) interval of 300 ms. The comparison image in all experiments was a Gaussian edge of contrast $c = 0.3$, whose blur $b_0$ was varied by the staircase routine:

$b$ = 10, 20, or 30 arcmin. The local contrast function $C(x) = [I(x) - I_0]/I_0$ of the original Gaussian edge was defined by $C(x) = [2\Phi(x, b) - 1]$. The waveform $C(x)$ (range −1 to +1) was passed through a Naka–Rushton nonlinear transformation $C/(s + |C|)$, and after appropriate scaling of amplitude the luminance profile of the modified edge was

$s$ were 0.1, 0.3, 1, 3, 10, 100, and 1000. The maximum gradient increased as $s$ decreased, but Michelson contrast ($c = 0.3$) was held constant.
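The sharpening manipulation can be sketched as follows. The equation for the final luminance profile is not reproduced in this text, so the amplitude scaling here is an assumption: the transformed waveform is multiplied by $(1 + s)$ so that its asymptotes return to ±1, and the result is then scaled by the contrast $c$ around a mean luminance of 1. The sampling grid and the function names are hypothetical.

```python
import numpy as np
from math import erf

def sharpened_edge(x, b, s, c=0.3, mean_lum=1.0):
    """Gaussian edge passed through C / (s + |C|), rescaled to Michelson contrast c."""
    C = np.vectorize(erf)(x / (b * np.sqrt(2.0)))   # equals 2 * Phi(x, b) - 1, range -1..+1
    T = C / (s + np.abs(C))                         # Naka-Rushton transformation
    T = T * (1.0 + s)                               # assumed rescaling: asymptotes back to +/-1
    return mean_lum * (1.0 + c * T)

def michelson(lum):
    return (lum.max() - lum.min()) / (lum.max() + lum.min())

x = np.arange(-60.0, 60.0 + 1e-9, 0.05)             # position in arcmin
s_values = [0.1, 0.3, 1, 3, 10, 100, 1000]          # values listed in the text
b = 10.0                                            # one of the test blurs (arcmin)

profiles = [sharpened_edge(x, b, s) for s in s_values]
max_gradients = [np.gradient(p, x).max() for p in profiles]
contrasts = [michelson(p) for p in profiles]
for s, g, c in zip(s_values, max_gradients, contrasts):
    print(f"s = {s:>6}: max gradient = {g:.4f}, Michelson contrast = {c:.3f}")
```

Under this scaling the slope at the edge centre grows by a factor of $(1 + s)/s$ relative to the Gaussian edge, so small $s$ gives a much steeper edge while large $s$ leaves it nearly Gaussian.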

$x$-direction with a flat profile in the $y$-direction at three scales (5.7, 11.3, and 22.6 arcmin). Subjects matched the blur of the central edge. Order −1 is a baseline condition where test and comparison edges are both Gaussian integrals. Michelson contrast was 0.32, duration 230 ms.

$b_1$, $b_2$) and contrasts ($c_1$, $c_2$) are added together in the same place, how blurred does the resulting edge appear to be? Overall contrast was constant ($c_1 + c_2 = 0.3$), whereas the contrast ratio ($r = c_1/c_2$) was varied to determine what contribution the two component edges made to the mixture.

$b_0$ as the mixture ranged from mainly the smaller blur ($b_2$) through to mainly the larger blur ($b_1$). Not surprisingly, as more of the large blur entered the mixture, the perceived (matched) blur increased from $b_2$ to $b_1$ (Figure 3A: 5–15 arcmin; or Figure 3B: 10–30 arcmin). The interest lies in the manner of this transition. Firstly, vision does not simply average the blurs. Matched blur was not well predicted by a contrast-weighted average of the two component blurs, $(c_1 b_1 + c_2 b_2)/(c_1 + c_2)$ (Figures 3A and 3B, black dash-dot curve). When the component edges had equal contrast ($r = 1$), the matched blur remained almost equal to the smaller blur value. By interpolation, the contrast of the more blurred edge had to be about five times greater than that of the less blurred edge (around $r = 5$) before the matched blur reached the average of the component blurs. Blur averaging, then, does not explain the perception of blur mixtures. (A referee asked whether compressive transformation of contrast before averaging might improve the fit. To test this, we applied a power function [exponent $p$] to the contrast values before calculating the weighted average blur. For $p$ around 0.6, there was a small [9%] improvement in RMS error, but the fit to the data remained poor.)
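For reference, the blur-averaging prediction that the data reject is easy to compute. The sketch below implements the contrast-weighted average $(c_1 b_1 + c_2 b_2)/(c_1 + c_2)$, with an optional power-function exponent $p$ applied to the contrasts as in the referee's suggestion. The function name and the way the contrasts are derived from the ratio $r$ are illustrative assumptions.

```python
def weighted_average_blur(b1, b2, r, p=1.0, total_contrast=0.3):
    """Contrast-weighted mean of two blurs, where r = c1 / c2 and c1 + c2 is fixed."""
    c2 = total_contrast / (1.0 + r)
    c1 = total_contrast - c2
    w1, w2 = c1**p, c2**p            # p < 1 compresses the contrast values
    return (w1 * b1 + w2 * b2) / (w1 + w2)

ratios = [0.1, 1, 3, 10, 30, 100]    # contrast ratios used in the experiment
for r in ratios:
    linear = weighted_average_blur(15.0, 5.0, r)
    compressed = weighted_average_blur(15.0, 5.0, r, p=0.6)
    print(f"r = {r:>5}: average = {linear:.2f}, with p = 0.6: {compressed:.2f} arcmin")
```

At $r = 1$ the averaging rule gives the midpoint of the two blurs (10 arcmin for the 15 and 5 arcmin pair), whereas the observers' matches stayed close to the smaller blur.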

$N_1$ model (thin red curve) fell much closer to the experimental data.

$N_3$ or $N_3^{+}$. (The two versions make identical predictions for these experiments, in which the luminance gradients across a given test edge are ≥0 everywhere [or, with reversed polarity, ≤0 everywhere].) The fit was good for one observer (K.A.M.) and excellent for the other (M.A.G.). There are no free parameters here, and nothing was adjusted to achieve this fit (Figures 3A and 3B, thick red curve). Taken over the 12 conditions and 2 observers, the RMS error between model and data (Figure 3C) was 4.2 arcmin for blur averaging, 3.6 for luminance template, 2.1 for $N_1$, but only 1.3 for $N_3^{+}$. In summary, blur matching in this experiment was fairly well predicted by the multiscale first derivative $N_1$ model but very well predicted by the multiscale third derivative $N_3$ or $N_3^{+}$.

$s$) decreased, the waveform was no longer Gaussian but had a steeper gradient and appeared sharper. Most importantly, the blur-matching values were well predicted by the $N_3$ model (Figures 3D, 3E, and 3F). The pattern of deviation for the other models was fairly similar to the blur mixture experiment. RMS error between model and data over 2 observers and 21 test conditions was 3.9 arcmin for luminance template, 2.2 for $N_1$, but only 1.1 for $N_3$. (Blur averaging was not definable in this case.)

$N_3$ model.

$N_3^{+}$ model plays no role when all gradients are positive. Hence, we further challenged that model and tested it for the presence of the first rectifier by introducing multiple edges with positive and negative gradients. For an existing data set (Georgeson, 1994), the test pattern was either a full sine wave grating (Figure 4C), a single period of a sine wave (Figure 4B), or a half-period sine wave edge (Figure 4A). These three test images were identical within the central half-period (the test edge) but different outside that region. They also have very different Fourier spectra: low-pass (single edge) versus narrowband (grating). Matched Gaussian blur was found to be directly proportional to the sine wave period (Figure 4) and was the same for all three test types. Thus, adding sine wave edges adjacent to the central test edge narrowed the Fourier spectrum and increased the total contrast energy but had no effect on perceived blur. Of the three multiscale models ($N_1$, $N_3$, and $N_3^{+}$), only $N_3^{+}$ predicted the data accurately in all three cases (Figures 4A, 4B, and 4C). The linear $N_1$ model underestimated the blur matches, whereas the linear $N_3$ model seriously overestimated them (Figures 4B and 4C).

$N_3^{+}$ succeeds here because the first rectifier segments the image into regions of positive gradient, separated by zero-valued regions (where the gradient is zero or negative, vetoed by the rectifier). Responses in the neighborhood of an edge are then the same whether the image is periodic (a grating) or aperiodic (an isolated edge), leading to the same blur code in each case (see scale–space maps in Figure S2).

$N_3^{+}$ model was an accurate and clear winner (Figure 5).

$f + 3f$) grating. The $N_3^{+}$ model made very accurate predictions for both experiments (Figure S3). The linear $N_1$ model was fairly accurate for these two experiments, but the linear $N_3$ model was poor for the $f + 3f$ experiment. As in the sine wave experiment (Figure 4), it greatly overestimated the blur.

$N_3^{+}$ was remarkably accurate in predicting the absolute values of blur matches, across a diverse set of conditions. Such unusual precision suggests that the model correctly captures some of the key processes in early visual coding of blur.

$x$-direction) was unchanged. As with our one-dimensional results, blur matching was scale invariant: The effects of edge length were nearly the same at all four test blurs (Figure 6) when the test length and the resulting blur match were expressed as a proportion of test blur. For an eightfold range of test blurs ($b$), the perceived blur started to decrease as test length fell below about $6b$, and edges looked about 50% less blurred when the length equalled $b$.

$N_3^{+}$ (or $N_3$) model with filter kernels that were derivatives of an isotropic (circular) Gaussian (Figures 1A and 1B). The fit to the data was good. The $N_1$ model with the same isotropic assumption greatly overestimated the degree of sharpening with length reduction, although this would be improved by (perhaps implausibly) assuming an even shorter length–width ratio for the $N_1$ kernels.

$N_1$, $N_3^{+}$) stood out from the others that we considered, and of these two, the nonlinear third derivative model, $N_3^{+}$, was consistently the better predictor of blur matching.

$f + 3f$) gratings and (5) sine wave gratings, as well as (6) the equivalence of blur in periodic and single sine wave edges, and (7) the striking finding that shorter edges look sharper. This entire raft of findings was predicted by a scale–space model that has a very specific processing architecture (Figure 1B), and only one parameter was chosen to fit the data.

$N_3^{+}$ that make it a viable edge finder whereas the linear model $N_3$ is not. There are two filter stages, each followed by a half-wave rectifier. The first filter $R_1$ computes local gradients, at multiple scales. For a single edge (Figure 7A), the rectified output $R_1^{+}$ is the same as $R_1$ (Figure 7B): a ridge in scale–space at the edge location. After differentiating twice more, inverting the sign, and applying the scale normalization, the linear response $N_3$ has a characteristic signature in scale–space: a central peak with two “wings” of opposite polarity (Figure 7C; also Figures 2D and 2I). Using a naive peak/trough rule, these wings could be falsely taken as additional negative-going edges. The first rectifier, however, guarantees that only positive peaks in the second-filter output correspond to edge locations. It does so because $R_1^{+}$ vetoes negative gradients. This veto ensures that negative parts of the second-filter response could not possibly arise from negative-going edges: They are already suppressed at Stage 1. Thus, the second rectifier can be routinely applied to exclude the negative wings and isolate the edge response at the correct location and scale (Figure 7D). Without the first rectifier, the negative troughs could not be safely excluded as candidate edges. In this way, the nonlinear structure of $N_3^{+}$ overcomes the “multiple-peaks problem” that occurs with narrowband linear spatial filtering ($N_3$).

$N_3$ response to one edge interfere with the main response peak for the other edge (Figure 7G). There is some distortion of peak position, and the peak scale (blur code) is shifted by as much as half an octave. Thus, the $N_3$ model overestimated blur for all our experiments with periodic edges, but the human observers did not. The nonlinear channel neatly eliminates this interference: By excluding the negative gradients at Stage 1, the interfering “wings” are removed from Stage 2, and the response to the preferred edge (Figure 7H) is almost identical to that for an isolated edge (Figure 7D), with no distortion of scale or position. In short, the first rectifier plays two key roles: (1) It eliminates the irrelevant peaks that would be introduced by successive differentiation, and (2) it eliminates interference between neighboring edges.

$N_3^{+}$ channel output, representing a positive-going edge, requires the gradient at that point to be positive while the third derivative is negative. The “wings” in the $N_3$ response (Figure 7C) do not satisfy this sign constraint and are rejected by $N_3^{+}$. Thus, the $N_3^{+}$ model does not explicitly identify ZCs, but it has much in common with models that do.

$N_3^{+}$, $N_3^{-}$) were compared with data in which observers marked with a cursor the locations of perceived edges in a family of “phase-coherent” one-dimensional test images (see Supplementary methods for details).

$N_3^{+}$ model also does so and that it is well-behaved with no misses and no false-positives. The role of the rectifiers (discussed above) in suppressing “phantom” edges that would otherwise be introduced by the narrowband filtering (Clark, 1989) appears to be robust in this wide-ranging test.

$N_3$) linear filtering, without harmonic distortion, that does introduce spurious features (e.g., Figures 2I and 7G). This may seem paradoxical but begins to make more sense when we view the goal of these nonlinear channels as feature analysis, not frequency analysis.

$N_3^{+}$ channel might look like to a physiologist, and how they might correspond with those in V1. The first stage ($R_1^{+}$) is much like a simple cell with adjacent ON and OFF subregions. Like the standard model for simple cells, it is a linear spatial filter followed by half-wave rectification (Figure 1B). (We omit here the complications introduced by divisive gain controls in LGN (lateral geniculate nucleus) and cortex. This may be a reasonable simplification for our experiments where contrast was fixed at about 30% throughout. If the gain of the mechanisms is set by contrast, then at fixed contrast the system will be quasi-linear, which is what our model assumes.)

$\sigma_1/\sigma$) of the first stage. When $\sigma_1$ is relatively large (Figure 9D), the RF has nonoverlapping ON and OFF subregions separated by a small gap, much like a simple cell (cf. Figure 2A of Kagan, Gur, & Snodderly, 2002). When $\sigma_1 = \sigma_2$, the subregions abut (Figure 9C), again like a simple cell. But when $\sigma_1$ is relatively small (Figures 9A and 9B), the ON and OFF regions overlap considerably, a characteristic of complex cells (Hubel & Wiesel, 1962; Kagan et al., 2002; Mata & Ringach, 2005; Martinez et al., 2005). In this model, the separation or overlap of ON and OFF subregions is controlled by the size of the first filter.

$\sigma_1/\sigma = 0.25$. To see whether this was critical, we re-ran the model for several experiments with $\sigma_1/\sigma$ ranging from 0.1 to 0.9. As expected, when the input had a single sign of gradient (an isolated positive edge), $\sigma_1/\sigma$ was immaterial because the first rectifier then has no influence on the channel output. On the other hand, for periodic waveforms predicted blur remained close to the data for $\sigma_1/\sigma$ up to 0.5 but was increasingly overestimated (by up to a factor of 2) as $\sigma_1/\sigma$ increased from 0.5 to 0.9. We conclude that the success of $N_3^{+}$ in predicting perceived blur does require the first filter to be small compared with the second filter, and that this is associated with the complex-like RF of Figure 9B. The sequence of two nonlinear stages (filter–rectify–filter–rectify) is essential to its correct behavior in edge coding and has some parallel in recent physiological findings that simple cells in layer 4 of the cat cortex are prior to, and provide the input for, complex cells in layers 2 and 3 (Martinez & Alonso, 2001).

$N_3^{+}$ mechanism (Figures 1B and 9B) was very similar to the band-pass tuning of the linear, Gaussian third derivative ($N_3$) filter. The only difference was a slight broadening of responses on the low frequency side of the peak (not shown). The spatial waveform of the response to a sine grating was fairly similar to a half-wave rectified sine wave, showing high response modulation ($F_1/F_0 \approx 1.6$) that is thought to be more characteristic of simple cells than complex cells. We note, however, that the degree of modulation varies widely across cells, and that many cells classed as complex in the awake monkey were found to have high response modulation for drifting gratings (Kagan et al., 2002).

$N_3^{+}$ channel would respond to second-order structure as well as to luminance edges, but we found its computed response to contrast modulation to be weak whereas its response to the high frequency carrier was strong. There are two aspects of FRF channel design that can promote a strong response to second-order modulation while suppressing responses to the carrier. These are (1) full-wave rectification after the first filter and (2) little overlap in the orientation and/or spatial frequency tuning of the first and second filters (Bergen & Landy, 1991; Chubb & Sperling, 1988; Dakin & Mareschal, 2000; Wilson, Ferrera, & Yo, 1992). The $N_3^{+}$ channel, on the other hand, has half-wave rectification and considerable overlap in the filter tunings. These give it the interesting edge-coding properties discussed here but make it ill suited to second-order signal processing. Nonlinear FRF “sandwich” mechanisms can evidently be exploited for different purposes in first and second order vision, depending on the details of the FRF structure.

$f$ spectrum, all filters carry the same information load; that is, all filters have the same expected variance in their outputs over space and time, and this allows information to be coded by neurons that have the same limited dynamic range in their responses.

$f$ spectrum, the output energy across filter scales will be constant. Thus, an image might be judged as in-focus when responses (aggregated across space) are equal across scale, but judged as blurred when responses decline at the smaller scales. To cope with the fact that spatial structure is sparsely distributed in some images, but dense in others, they introduced a nonlinear thresholding scheme, the rectified contrast spectrum (RCS), in which the variance of each channel output was computed not over the whole image, but only over regions containing significant structure, where local responses exceeded a threshold ($s/2$, where $s$ is the standard deviation of responses over the whole image). Finally, the slope of the RCS (on log–log axes) was taken as an index of image blur.

$\alpha = n$ in Equation 4 produces an equal-amplitude filtering scheme of the required kind (see Appendix 2), enabling us to compare our blur code (where $\alpha = n/2$) with that proposed by Field and Brady (1997). Figures 10A and 10B show the $N_3^{+}$ responses across filter scales computed for a sine grating, a single (half-period) sine edge, and a Gaussian edge. Experiment 3 (Figure 4) found that Gaussian blur $b$ would, on average, match a sine wave edge of period $p$ when the ratio $b/p$ was 0.14. This $b/p$ ratio was therefore used in Figure 10 so that all three types of edge would have the same perceived blur. Figures 10A and 10B illustrate two key properties of $N_3^{+}$: that peaks of response exist for both periodic and aperiodic edges, and that they occur at the same filter scale when the waveforms are perceptually matched in blur. In short, peak scale predicts edge blur.

$s/2$ threshold (defined above). RCSs for the two single edges are similar, but they have little in common with the RCS for a grating. The grating response is scale tuned, whereas the single-edge response increases monotonically with filter scale. Similar divergence between the RCSs for periodic and aperiodic edges was seen for even-symmetric RFs ($n = 2$) and odd ones ($n = 1$ or 3) and at all threshold levels. These properties reflect the behavior of the underlying linear filters, seen in Figure A2. All the RCSs are curvilinear functions of scale, and so it is not easy to see any simple measure (such as slope) that would encode the blur or capture the equivalence of perceived blur between gratings and single edges. This contrasts with the success of the RCS in representing changes in spectral slope of natural images, textures, or two-dimensional noise via a single RCS slope measure. Indeed, Field and Brady (1997) anticipated that the RCS approach would have difficulty in encoding edge blur, partly because “altering the slope of the spectrum is not a good model of optical blur” (p. 3382) and partly because the RCS is still a global measure, whereas “a more accurate measure of blur will certainly involve local measures and will probably be best calculated on an edge by edge basis” (p. 3381). We agree with both points, and we propose peak finding in the $N_3^{+}$ scale–space as an effective and empirically supported algorithm for edge finding and local blur coding. It remains to be seen whether the sense of global blur obtained from an optically blurred image can be understood as some simple aggregate of these local blur measures (Dijk, van Ginkel, van Asselt, van Vliet, & Verbeek, 2003).

$\sigma^{-n/2}$ would produce an output whose gains matched those needed for $N_3^{+}$. In short, the benefits of Field and Brady's scheme, and those of $N_3^{+}$, could coexist at successive stages of processing.

$\alpha = n/2$ in Equation 4), the problem of locating edges and determining their blur reduces to finding peaks in the scale–space map of responses. Predictions from a linear model ($N_1$) based on Gaussian first derivative filters were in fair agreement with our blur-matching data, but the nonlinear third derivative model was consistently more accurate. Each channel in the $N_3^{+}$ model has a two-stage structure analogous to the sequence from simple to complex cells in visual cortex, and the half-wave rectifying nonlinearities play a crucial role in enabling edge finding without false-positives. The $N_3^{+}$ model draws together three lines of thought about vision: computational, physiological, and psychophysical. It implements a principled, scale–space approach to the representation of key features in early vision and does so via physiologically plausible mechanisms, supported by some strikingly accurate predictions about human perception.

$N_3^{+}$ (or $N_3$) model than the $N_1$ model. This enables the $N_3^{+}$ model to resolve pairs of closely spaced edges better. Here the 2 edges had the same polarity, amplitude and blur ($\beta$ = 8 pixels), with spatial separation of $5\beta$, $3\beta$ or $2\beta$, as shown. The $N_1$ model failed to resolve two peaks for edges separated by less than $4.8\beta$, while $N_3^{+}$ could resolve them down to $2.5\beta$. This was true for all blurs $\beta$. The psychophysical limit for this task has yet to be tested.

$N_1$ model, middle row) and the positive and negative output channels ($N_3^{+}$, $N_3^{-}$) (bottom row). Segregation of positive and negative edge responses in the $N_3^{+/-}$ model prevents edge blur coding from being influenced by neighbouring edges.

$n$th harmonic, where $n = 1, 3, 5, \ldots, 15$. Subjects judged the blur of its central edge against a single Gaussian comparison edge, using the two-interval procedure described in the text. Fundamental frequency $f$ was 0.35 c/deg; fundamental contrast = 0.32. Not surprisingly, as $n$ increased, the closer approximations to a step edge looked sharper (see insets), with excellent quantitative agreement between observers (M.A.G., T.C.A.F.). With no free parameters, all three models captured this trend fairly well, but predictions from $N_3^{+}$ were more accurate than the linear $N_1$ or $N_3$ models. (B) Experiment similar to A, but the test grating contained only $f$ and $3f$ components, where $f = 0.33$ or 1 c/deg (upper and lower datasets). For contrast ratios 1, 2, 4, 8, 16, 32, the pairs of ($f$, $3f$) contrasts were (8, 8), (11.3, 5.7), (16, 4), (22.6, 2.8), (32, 2), (45.2, 1.4). As the $f$ component contrast increased (relative to $3f$), the central edge looked increasingly blurred. Both models $N_1$ and $N_3^{+}$ predicted the results fairly accurately, but without the rectifier the $N_3$ model failed badly at relatively low $3f$ contrasts, where contrast ratio > 8. As with pure sine-waves, the rectifier plays a key role in isolating adjacent edges from each other when blur is large relative to separation.
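
The contrast pairs listed for panel B appear consistent, to within rounding, with holding the geometric mean of the two component contrasts fixed at 8 while varying their ratio. The short check below makes that arithmetic explicit; the fixed-geometric-mean rule is an inference from the listed numbers, not something the caption states, and `contrast_pair` is a hypothetical helper.

```python
import math

def contrast_pair(ratio, geometric_mean=8.0):
    """Return (c_f, c_3f) with c_f / c_3f == ratio and
    sqrt(c_f * c_3f) == geometric_mean (an inferred rule, see lead-in)."""
    root = math.sqrt(ratio)
    return geometric_mean * root, geometric_mean / root

if __name__ == "__main__":
    for ratio in (1, 2, 4, 8, 16, 32):
        c_f, c_3f = contrast_pair(ratio)
        print(ratio, c_f, c_3f)
```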

$\beta$. The peak shift occurs for both models $N_1$ and $N_3^{+}$ (or $N_3$), but $N_1$ overestimated the experimentally observed shift, while $N_3^{+}$ was fairly accurate. Here it was assumed that all filter kernels were partial derivatives of a circular Gaussian function.

$\alpha$ on the scale–space response to an edge. Columns (left to right) illustrate $\alpha = 0$, $n/4$, $n/2$, and $n$, where $n$ is the derivative order of the filter or channel. Normalization scales the final filter output by the factor $\sigma^{\alpha}$, where $\sigma$ is the scale of the filter. Hence larger values of $\alpha$ progressively amplify the large-scale filters relative to the smaller ones. This shifts peak responses to the larger scales, but eventually leads (when $\alpha = n$, 4th column) to a low-pass activity profile rather than a peaked one. We used $\alpha = n/2$ to ensure that the peak scale matched the edge blur. The filtering scheme proposed by Field (1987; Field & Brady, 1997), in which all channels have the same peak sensitivity to their preferred spatial frequency, is represented here by $\alpha = n$. It exhibits contrast constancy, but does not allow blur coding by peak-finding.
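
The effect of the exponent can be checked numerically. The sketch below assumes that the response magnitude at the location of a Gaussian edge of blur `b` is proportional to `sigma**alpha / (sigma**2 + b**2)**(n/2)`; that closed form follows from the Gaussian-derivative filters and Gaussian edge used in the model, but it is written out here as an assumption and the function names are illustrative.

```python
import math

def response_at_edge(sigma, b, n, alpha):
    """Response magnitude at the edge location, up to a constant factor
    (assumed closed form, see lead-in)."""
    return sigma**alpha / (sigma**2 + b**2) ** (n / 2.0)

def interior_peak_scale(b, n, alpha, lo=0.05, hi=2000.0, steps=4001):
    """Scale of the peak response over log-spaced filter scales,
    or None when the response is monotonic over the sampled range."""
    scales = [lo * (hi / lo) ** (i / (steps - 1)) for i in range(steps)]
    values = [response_at_edge(s, b, n, alpha) for s in scales]
    i_max = max(range(steps), key=values.__getitem__)
    if i_max == 0 or i_max == steps - 1:
        return None  # no interior peak, so blur cannot be read off
    return scales[i_max]

if __name__ == "__main__":
    n, b = 3, 8.0
    for alpha in (0.0, n / 4.0, n / 2.0, float(n)):
        print(alpha, interior_peak_scale(b, n, alpha))
```

Under this assumption the peak moves to larger scales as `alpha` grows, sits at `sigma = b` for `alpha = n/2`, and disappears at both `alpha = 0` and `alpha = n`.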

$I(x; b)$ of blur $b$ and unit amplitude, where luminance $I$ is the indefinite integral of the unit-area Gaussian $G(x; b)$.

$I$ is simply $G(x; b)$, and because variances add under convolution, Equation A1 reduces to

where $s = \sqrt{\sigma^{2} + b^{2}}$. When $n = 1$ (model $N_1$), the spatial profile of response to a Gaussian edge, at any filter scale $\sigma$, is a Gaussian whose spread increases with $\sigma$. These profiles peak at the edge location ($x = 0$). Response values at $x = 0$ are plotted across filter scales in Figure A1 (thin curves). As we saw in Figure 1, without scale normalization, responses are greatest at the smallest filter scale, but with the chosen normalization ($\alpha = n/2$), responses show a peak where $\sigma = b$. Responses to different edge blurs have a common asymptote at large filter scales. This means, as one might expect, that large-scale filters cannot distinguish between sharp edges and blurred ones.
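
The "variances add" step can be verified directly. The sketch below filters a sampled Gaussian edge with a sampled Gaussian first derivative kernel and compares the result with the closed form $\sigma^{\alpha} G(x; s)$, $s = \sqrt{\sigma^{2} + b^{2}}$. Pixel sampling, the 6-sigma kernel truncation, and the function names are assumptions of the example, and the closed form is written as implied by the text rather than copied from the original equations.

```python
import math

def gauss(x, s):
    """Unit-area Gaussian G(x; s)."""
    return math.exp(-x * x / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))

def gaussian_edge(x, b):
    """Integral of the unit-area Gaussian G(x; b): rises from 0 to 1."""
    return 0.5 * (1.0 + math.erf(x / (b * math.sqrt(2.0))))

def n1_response(x, sigma, b, alpha=0.5):
    """Numerical response of the normalized first-derivative filter at x."""
    hw = int(math.ceil(6.0 * sigma))  # truncate the kernel at 6 sigma
    total = 0.0
    for k in range(-hw, hw + 1):
        d1 = -(k / sigma**2) * gauss(k, sigma)  # kernel sample at offset k
        total += d1 * gaussian_edge(x - k, b)   # discrete convolution sum
    return sigma**alpha * total

def n1_closed_form(x, sigma, b, alpha=0.5):
    """sigma**alpha * G(x; s), with the variances of filter and edge added."""
    s = math.sqrt(sigma**2 + b**2)
    return sigma**alpha * gauss(x, s)
```

For filter scales of a few pixels or more the two functions agree closely, and `n1_closed_form(0, sigma, b)` is largest at `sigma = b` when `alpha = 0.5`.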

$N_3$, which (from Equation A2, with $n = 3$) becomes

At $x = 0$, this simplifies to

*shape* of the response curves (on log–log axes) is the same for all blurs. From Equations A3 and A4, for $n = 1, 3$ and for any $\alpha$, response magnitude over scale at $x = 0$ is a function of relative scale, $\sigma/b$:

$b^{\alpha - n}$.

$\sigma$, that when $\alpha = n/2$, the peak response occurs at scale $\sigma_{\max} = b$, as shown in Figure A1, right. This scaling property means that the scale $\sigma_{\max}$ of the most active filter identifies the edge blur $b$. This of course is the main goal of our model.
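
The location of the peak can be worked through explicitly. The derivation below is a reconstruction from the definitions in the text (a Gaussian edge of blur $b$ and amplitude $c$, a Gaussian derivative filter of order $n = 1$ or $3$, and normalization by $\sigma^{\alpha}$). It is offered as a sketch under those assumptions and may differ in form from the original numbered equations.

```latex
% Assumed: Gaussian edge (blur b, amplitude c), filter order n in {1, 3},
% normalization factor sigma^alpha, response evaluated at the edge (x = 0).
\begin{align}
|R_n(0;\sigma)|
  &= \frac{c\,\sigma^{\alpha}}{\sqrt{2\pi}\,(\sigma^{2}+b^{2})^{n/2}}
   = \frac{c\,b^{\alpha-n}}{\sqrt{2\pi}}\,
     \frac{(\sigma/b)^{\alpha}}{\bigl[1+(\sigma/b)^{2}\bigr]^{n/2}}, \\
\frac{\partial \ln |R_n|}{\partial \sigma}
  &= \frac{\alpha}{\sigma}-\frac{n\,\sigma}{\sigma^{2}+b^{2}} = 0
  \;\Longrightarrow\;
  \sigma_{\max} = b\,\sqrt{\frac{\alpha}{n-\alpha}} \qquad (0<\alpha<n), \\
\alpha = \tfrac{n}{2}
  &\;\Longrightarrow\;
  \sigma_{\max} = b, \qquad
  |R_n|_{\max} = \frac{c}{\sqrt{2\pi}\,(2b)^{n/2}} .
\end{align}
```

The first line shows the dependence on relative scale $\sigma/b$ with the overall factor $b^{\alpha-n}$; the second shows that an interior peak exists only for $0 < \alpha < n$; the third recovers $\sigma_{\max} = b$ for $\alpha = n/2$.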

$n = 1, 3$ and $\alpha = n/2$, with input edge amplitude $c$ ($0 \le c \le 1$), the peak response values are proportional to $c$, but also vary with blur $b$:

$\alpha = n/2$, peak response amplitude falls as $b^{-\alpha}$ (Figure A1, right). Thus, there is no “contrast constancy” in the output of these filter sets. Nevertheless, these filters do encode edge contrast. Once the edge location and blur have been found, edge amplitude $c$ can be recovered from the peak response value $R_{\max}$ and its corresponding scale $\sigma_{\max}$ by inserting these two values into Equation A7 or A8 and solving for $c$. It follows that for all edge blurs, the quantity $R_{\max}\,\sigma_{\max}^{\,n/2}$ is directly proportional to contrast. We examine these contrast-coding ideas more closely elsewhere (May & Georgeson, 2007).
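
As an illustration of this recovery step, the sketch below assumes the peak response takes the form $R_{\max} = c / \bigl(\sqrt{2\pi}\,(2b)^{n/2}\bigr)$ for $n = 1, 3$ and $\alpha = n/2$. That constant is derived from the Gaussian edge model rather than copied from Equations A7 and A8, which are not reproduced here, and the function names are hypothetical.

```python
import math

def peak_response(c, b, n):
    """Assumed peak response to an edge of amplitude c and blur b
    (n = 1 or 3, alpha = n/2); see the lead-in for the assumption."""
    return c / (math.sqrt(2.0 * math.pi) * (2.0 * b) ** (n / 2.0))

def recover_contrast(r_max, sigma_max, n):
    """Invert peak_response, using sigma_max as the estimate of blur b."""
    return r_max * math.sqrt(2.0 * math.pi) * (2.0 * sigma_max) ** (n / 2.0)

if __name__ == "__main__":
    # Same edge amplitude at two blurs: peak responses differ, but the
    # recovered amplitude is the same.
    for b in (2.0, 16.0):
        r = peak_response(0.3, b, n=3)
        print(b, r, recover_contrast(r, b, n=3))
```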

$N_3$ also apply to the nonlinear mechanism $N_3^{+}$ because for single edges (and any other input with nonnegative gradients), the first rectifier is immaterial and the behaviors of $N_3$ and $N_3^{+}$ are identical.

$\alpha = n$

$\alpha < n$. An interesting consequence of Equation A6 is that when $\alpha = n$ (or $\alpha = 0$), the edge response amplitude no longer has a peak that can identify edge blur. See Figure S5 for scale–space maps illustrating this. When $\alpha = n$, the responses have no peak, but instead a common horizontal asymptote at large filter scales (Figure A2, left).

$\alpha = n$ (Figure A2, right). We prove that result here for the case of linear filters, and we note that it is also true for the nonlinear $N_3^{+}$ channel. The response amplitude of the normalized $n$th Gaussian derivative operator of scale $\sigma$ to a sine wave grating of spatial frequency $f$ is easily obtained from the Fourier transform, $F$:

$p = 1/f$, so that when $\alpha = n$,

$\sigma/p$).
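
The grating case can be sketched numerically. The code below assumes the standard Fourier transform of a unit-area Gaussian, so that the amplitude of the normalized $n$th derivative filter's response to a grating of frequency $f = 1/p$ is $\sigma^{\alpha}(2\pi f)^{n}\exp(-2\pi^{2}f^{2}\sigma^{2})$. This expression is a reconstruction rather than a copy of the original equation, and the function names are illustrative.

```python
import math

def grating_amplitude(sigma, period, n, alpha, contrast=1.0):
    """Response amplitude of the sigma**alpha-normalized n-th Gaussian
    derivative filter to a sine grating (assumed form, see lead-in)."""
    f = 1.0 / period
    return (contrast * sigma**alpha * (2.0 * math.pi * f) ** n
            * math.exp(-2.0 * (math.pi * f * sigma) ** 2))

def peak_over_scale(period, n, alpha):
    """Peak over filter scale, from d(ln A)/d(sigma) =
    alpha / sigma - 4 * pi**2 * f**2 * sigma = 0."""
    sigma_max = period * math.sqrt(alpha) / (2.0 * math.pi)
    return sigma_max, grating_amplitude(sigma_max, period, n, alpha)

if __name__ == "__main__":
    n = 3
    for alpha in (n / 2.0, float(n)):
        for period in (10.0, 40.0):
            print(alpha, period, peak_over_scale(period, n, alpha))
```

With `alpha = n` the amplitude depends only on `sigma / period`, so the peak value is the same for every grating period; with `alpha = n/2` the peak value varies with period.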

$\alpha < n$. We used $\alpha = n/2$, which (uniquely) renders the response curve for Gaussian edges symmetrical about the peak, on a log scale axis (Figure A1, right). As $\alpha$ deviates from $n/2$, the response curves become more asymmetrical, and the correct (experimentally observed) equivalence between sine and Gaussian edges is gradually lost (not shown). Hence, there is a strong case for the specific model in which $\alpha = n/2$.

$b^{\alpha - n}$ (see 1). When $\alpha = n$, contrast constancy is obtained directly, as in the Brady and Field (1995) filtering scheme. When $\alpha < n$, contrast constancy can be restored after peak finding simply through rescaling by the known factor $b^{n - \alpha}$ as discussed above (1) and illustrated by the filled symbols in Figures 10A and 10B.

$N_3^{*}$, defined as $N_3$ but with $\alpha = n$). They proposed that the decline in contrast sensitivity at high SFs follows from the fact that the high frequency filters have a broader frequency bandwidth (in linear terms) and would collect more input noise when, as seems likely, the input noise has a fairly flat spectrum. Thus, at this level, the smaller scale filters have a poorer signal–noise ratio. A further processing step, rescaling all filter responses by $\sigma^{-n/2}$, would convert $N_3^{*}$ to behave exactly as $N_3$ with $\alpha = n/2$. But if this final linear step, which progressively attenuates the larger scale filters, adds no further noise, then signal–noise ratios in each filter remain unchanged, and the smaller scale filters still have the poorer signal–noise ratio. Despite their (now) low amplitude of response, the larger scale filters would continue to have the better signal–noise ratio.

$N_3$ or $N_3^{+}$ cannot easily be rejected by signal–noise ratio arguments. We also see that Brady and Field's (1995) filtering scheme (like $N_3^{*}$), in which all filters have equal peak-response amplitude, could be easily transformed into $N_3^{+}$ to find edges and encode blur, provided that successive stages of processing are considered.

*Computational models of visual processing*. Cambridge, MA: MIT Press.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 9, 726–741.

*The Journal of Physiology*, 203, 237–260.

*Vision Research*, 35, 739–756.

*Visual perception: Physiology, psychology and ecology*. Hove & New York: Psychology Press.

*Higher order processing in the visual system*. (pp. 129–141). Chichester: Wiley.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 8, 679–698.

*Journal of the Optical Society of America A, Optics and image science*, 5, 1986–2007.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 11, 43–57.

*Vision Research*, 40, 311–329.

*Vision Research*, 22, 545–559.

*Spatial Vision*. (pp. 1–381). Oxford: Oxford University Press.

*Lecture Notes in Computer Science*, 2756, 149–156.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 20, 699–716.

*Journal of the Optical Society of America A, Optics and image science*, 4, 2379–2394.

*Vision Research*, 37, 3367–3383.

*Vision Research*, 41, 711–724.

*Proceedings of the Royal Society B: Biological Sciences*, 249, 235–245.

*Higher order processing in the visual system*. (pp. 147–165). Chichester: Wiley.

*Image and Vision Computing*, 16, 389–405.

*Journal of Vision*, 6(6), 191.

*Vision Research*, 37, 127–142.

*The Journal of Physiology*, 252, 627–656.

*Treatise on physiological optics*. –482). Bristol: Thoemmes Press (Original work published 1856).

*Vision Research*, 45, 507–525.

*The Journal of Physiology*, 160, 106–154.

*Journal of Neurophysiology*, 88, 2557–2574.

*Journal of the Optical Society of America A, Optics and image science*, 2, 1170–1190.

*Biological Cybernetics*, 50, 363–370.

*Proceedings of the Institute of Radio Engineers*, 43, 560–570.

*Psychological Research*, 64, 136–148.

*Biological Cybernetics*, 43, 187–198.

*International Journal of Computer Vision*, 30, 117–154.

*Vision Research*, 35, 2697–2722.

*Mach bands*. (pp. 253–271).

*Vision*. New York: Freeman.

*Proceedings of the Royal Society of London B: Biological Sciences*, 207, 187–217.

*Neuron*, 32, 515–525.

*Nature Neuroscience*, 8, 372–379.

*Journal of Neurophysiology*, 93, 919–928.

*Vision Research*, 47, 1705–1720.

*Proceedings of the Royal Society of London B: Biological Sciences*, 235, 221–245.

*Pattern Recognition Letters*, 16, 667–677.

*Pattern Recognition Letters*, 6, 303–313.

*The Journal of Physiology*, 283, 53–77.

*Nature*, 381, 607–609.

*Proceedings of the IEEE*, 90, 78–93.

*Mach bands*. (pp. 1–365). San Francisco: Holden-Day.

*Trends in Neurosciences*, 15, 86–92.

*Vision Research*, 39, 2697–2716.

*Front-end vision and multi-scale image analysis*. (pp. 1–464). Dordrecht: Kluwer.

*International Journal of Pattern Recognition and Artificial Intelligence*, 14, 757–777.

*Proceedings of the Royal Society B: Biological Sciences*, 265, 359–366.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 11, 973–977.

*Visual processing*. (pp. 1–168). Hove and London: Erlbaum.

*Vision Research*, 23, 1465–1477.

*Vision Research*, 25, 1661–1674.

*Physical and biological processing of images*. (pp. 88–99). New York: Springer-Verlag.

*Visual Neuroscience*, 9, 79–97.

*Visual Neuroscience*, 11, 1205–1220.

*Proceedings of the 8th International Joint Conference on Artificial Intelligence*, 1019–1022.

*Spatial Vision*, 14, 321–389.

*International Journal of Computer Vision*, 24, 219–250.