**It is well known that “smooth” chains of oriented elements—contours—are more easily detected amid background noise than more undulating (i.e., “less smooth”) chains. Here, we develop a Bayesian framework for contour detection and show that it predicts that detection performance should decrease with the contour's complexity, quantified as the description length (DL; i.e., the negative logarithm of probability integrated along the contour). We tested this prediction in two experiments in which subjects were asked to detect simple open contours amid pixel noise. In Experiment 1, we demonstrate a consistent decline in performance with increasingly complex contours, as predicted by the Bayesian model. In Experiment 2, we confirmed that this effect is due to integrated complexity along the contour, and does not seem to depend on local stretches of linear structure. The results corroborate the probabilistic model of contours, and show how contour detection can be understood as a special case of a more general process—the identification of organized patterns in the environment.**

*α*, between consecutive elements are large, an effect widely interpreted as implicating lateral connections between oriented receptive fields in visual cortex, known as the *association field*. This preference for small turning angles is sometimes expressed as a preference for smooth curves, because turning angle can be thought of as a discretization of the curvature of an implied underlying contour.

^{1}The association field has proven a useful construct in understanding contour integration, but remains qualitative in that it suggests a simple dichotomy between contours that are grouped and those that are not (in its original form, the model assumes that neighboring elements are integrated if they are collinear to within some tolerance). A number of recent studies have, however, demonstrated that, broadly speaking, detection performance declines with larger turning angles (Ernst et al., 2012; Geisler et al., 2001; Hess, Hayes, & Field, 2003; Pettet et al., 1998). The more the curve “zigs and zags,” the less detectable it is.

*H*_{C}) are assumed to be generated identically and independently (i.i.d.) from a von Mises distribution centered on straight (collinear), with a spread parameter *β* analogous to the inverse of the variance. The i.i.d. von Mises model makes a reasonable generic model of smooth contours, because it satisfies basic considerations of symmetry and smoothness in a maximally general way. Specifically, the model is the maximum entropy (least “informative”) distribution satisfying the following constraints. First, it is centered at 0°, which is required by basic considerations of symmetry, namely that in an open contour (without figure/ground assigned) there is no way to know in which direction we are traversing the contour, so left and right turns are indistinguishable. Second, the turning angle has some variance about this mean, meaning that turning angles deviate from 0° (straight) with probability that decreases with increasing turning angle; this is inherent in the notion of “smoothness,” because, by definition, at a sufficiently fine scale a smooth contour is well approximated by its local tangent (see Singh & Feldman, 2013). Any reasonable model of turning angles in smooth, open contours must satisfy these constraints, and because it has maximum entropy given these constraints, the i.i.d. von Mises model does so in the most general way possible (see Jaynes, 1988). Notably, empirically tabulated turning-angle distributions drawn from subjectively chosen contours obey these constraints, albeit with some differences in functional form (Elder & Goldberg, 2002; Geisler et al., 2001; Ren & Malik, 2002).^{2}

This distribution serves as the *likelihood model* for turning angles under the smooth contour hypothesis (*H*_{C}), giving the probability of each particular turning angle under that hypothesis. In practice it is often more mathematically convenient to work with a Gaussian distribution of turning angle, which, for small angles, is nearly identical numerically to the von Mises model.^{3} Either distribution (von Mises or Gaussian) captures the idea that, at each point along a smooth contour, the contour is most likely to continue straight (zero turning angle), with larger turning angles increasingly unlikely.
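The closeness of the two likelihood models is easy to verify numerically. The following sketch (an illustration of our own; the concentration value *β* = 8 is an arbitrary assumption, not a fitted parameter) compares the von Mises density centered at 0° with its Gaussian approximation:

```python
import numpy as np

def vonmises_pdf(theta, beta):
    # Von Mises density centered at 0 with concentration beta (theta in radians)
    return np.exp(beta * np.cos(theta)) / (2 * np.pi * np.i0(beta))

def gaussian_pdf(theta, beta):
    # Gaussian approximation: for large beta, the variance is approximately 1/beta
    sigma2 = 1.0 / beta
    return np.exp(-theta ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

beta = 8.0  # illustrative concentration; larger beta = smoother contours
angles = np.deg2rad(np.array([0.0, 10.0, 20.0, 45.0]))
vm = vonmises_pdf(angles, beta)
ga = gaussian_pdf(angles, beta)
# The two densities agree closely for small turning angles, and both fall
# off monotonically as the turning angle grows.
```

For small angles the two densities differ by only a few percent; the discrepancy grows only at large turning angles, where both densities are already small.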

A sequence of turning angles [*α*_{i}] = [*α*_{1}, *α*_{2}, … *α*_{N}] of length *N* has probability given by the product of the individual angle probabilities, which, under the Gaussian approximation, equals

*p*([*α*_{i}] | *H*_{C}) = ∏_{i} *p*(*α*_{i} | *H*_{C}) ∝ exp(−Σ_{i} *α*_{i}^{2}/2*σ*^{2}).

The role of *internal* noise in obscuring this decision, while certainly important, is a separate matter that we take up in the Discussion. Our model is emphatically *not* an ideal detector of contours. Indeed, as we discuss below, an ideal observer model for our task would be rather trivial, and would exhibit performance markedly different from that of our subjects. Instead, our model, being ignorant of many of the details of stimulus construction, makes a set of simplifying assumptions about the statistical structure of contours (some of which happen to be wrong for these displays) but that lead to systematic predictions about performance, in particular that performance will decline with contour complexity. The goal of the analysis is not to understand how these specific stimuli can be optimally classified, but to understand how a broad set of assumptions about pattern structure lead to, and thus explain, certain striking characteristics of performance.

**Figure 1**


**Figure 2**


**Figure 3**


*C*, with von Mises distributed turning angles, or is simply part of a patch of random background pixels (Figure 3), which we will refer to as the “null” model, *H*_{0}. We can use Bayes's rule to assess the relative probability of the two models, and then extend the decision to a series of image patches extending the length of the potential contour.

Under the null model, each turning angle has a fixed probability *p*(*α*|*H*_{0}) = *ϵ*. In our displays, *ϵ* = 1/3 because the curve could continue left (45°), straight (0°), or right (−45°), all with equal probability^{4}, but here we state the theory in more general terms. In the null model, as in the contour model, we assume that turning angles are all independent conditioned on the model, so the likelihood of the angle sequence [*α*_{i}] is just the product of the individual angle probabilities, reflecting a sequence of *N* “accidental” turning angles each with probability *ϵ*.

The decision between the two hypotheses, *H*_{C} and *H*_{0}, is given by the posterior ratio *p*(*H*_{C}|[*α*_{i}])/*p*(*H*_{0}|[*α*_{i}]). Assuming equal priors, which is reasonable in our experiments as targets and distractors appear equally often, this posterior ratio is simply the likelihood ratio, *L*_{C}/*L*_{0}. As is conventional, we take the logarithm of this ratio to give an additive measure of the evidence in favor of the contour model relative to the null model, which is sometimes called the weight of evidence (WOE):

WOE = ln(*L*_{C}/*L*_{0}) = ln *p*([*α*_{i}]|*H*_{C}) − ln *p*([*α*_{i}]|*H*_{0}).

The description length (DL) is the number of bits required to express a message *M*, having probability *p*(*M*), in an optimal code, making it a measure of the *complexity* of the message *M*.^{5} Once the coding language is optimized, less probable messages take more bits to express. In our case, plugging in the expression for the likelihood of the contour model (Equation 4), the DL of the contour [*α*_{i}] is just

DL([*α*_{i}]) = −ln *p*([*α*_{i}]|*H*_{C}) = Σ_{i} *α*_{i}^{2}/2*σ*^{2} + *N* ln(*σ*√(2π)),

and the weight of evidence in favor of the contour model is the negative of the DL minus a constant.

For fixed *N*, the only term in Equation 8 that depends on the *shape* of the contour is the sum of squared turning angles, Σ_{i} *α*_{i}^{2} (the *N* ln *ϵ* term is a constant that does not depend on the observed turning angles). The larger this sum, the more the contour undulates, and the higher its DL. The larger the DL, the less evidence in favor of the smooth contour model; the smaller the DL, the smoother the contour, and the more evidence in favor of the smooth model.
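These quantities are simple to compute directly. The following sketch (our own illustration; the function names, the *σ* value, and the sample contours are assumptions, with *ϵ* = 1/3 as in our displays) evaluates the DL of a turning-angle sequence under the Gaussian approximation and the resulting weight of evidence:

```python
import numpy as np

def description_length(alphas, sigma):
    # DL = -sum of log p(alpha_i | H_C) under the Gaussian approximation;
    # only the sum-of-squares (shape) term depends on the turning angles
    n = len(alphas)
    shape_term = np.sum(alphas ** 2) / (2 * sigma ** 2)
    constant = n * np.log(sigma * np.sqrt(2 * np.pi))
    return shape_term + constant

def weight_of_evidence(alphas, sigma, eps=1 / 3):
    # WOE = log L_C - log L_0 = -DL - N * log(eps)
    return -description_length(alphas, sigma) - len(alphas) * np.log(eps)

sigma = np.deg2rad(25.0)                 # assumed angle spread
straight = np.deg2rad(np.zeros(20))      # 20 perfectly straight steps
wiggly = np.deg2rad(np.full(20, 45.0))   # a 45 deg turn at every step
# A straight contour has low DL (positive evidence for the smooth model);
# a heavily undulating contour has high DL (evidence against it).
```

With these illustrative values, the straight contour yields a positive WOE (classified as a contour) and the undulating one a negative WOE (classified as noise).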

The DL is thus a *sufficient statistic* for this decision, meaning that it conveys all the information available to the observer about the contour's likelihood under the smooth contour model.

The model also predicts an effect of *N*, the length of the contour (note that ln *ϵ* in these equations is a negative number because *ϵ* < 1). By itself, this prediction is somewhat less interesting because it would probably follow from any reasonable model so, in what follows, we mostly focus on the complexity effect, which is more specific to our approach. However, it is worth noting that this prediction falls out of our model in a natural way without any ad hoc assumptions about the influence of contour length.

^{2}. A contour would be embedded in one of two target images. The contour was either a “long” contour (220 pixels, 5.25° if the contour was entirely straight) or a “short” contour (110 pixels, 2.62° if entirely straight). The contour was generated by sampling sequential turning angles (*α*) using a discretized approximation to a von Mises distribution: continuing straight (*α* = 0) is more common than a turn to the right or left (*α* = *π*/4 or *α* = −*π*/4). Once the contour was generated, it was dilated using the dilation mask [0 1 0; 1 1 0; 0 0 0], chosen so that, in a pilot study with author JDW as the observer, a perfectly straight contour oriented horizontally or vertically was detected as easily as one oriented at an arbitrary angle. Next, to prevent the contour from appearing at a different scale from the random pixels in the image, the image was increased in size to 550 × 550 pixels using nearest-neighbor interpolation, so that each pixel in the smaller image became a 2 × 2 pixel area in the larger image. Finally, the contour was embedded in one of the target images: it was randomly rotated and then placed at a random location in the image, with the restriction that the contour could not extend beyond the edge of the image.

^{2}) and white pixels (98 cd/m^{2}). The images were created by randomly sampling a binary matrix of size 225 × 225, and then scaling the matrix to 550 × 550 using nearest-neighbor interpolation, just as in the target images.

^{2}, or 64% contrast. Figure 1 shows a sample stimulus at the contrast used in the experiments; the stimuli in Figure 2 are shown with enhanced contrast. There were 40 images of each length and each complexity, resulting in a total of 40 × 2 × 5 = 400 trials. The luminance of the contour, the image size, and the contour lengths were chosen so that, in a pilot study, subjects performed near 75% correct.

*p* = 1.9 × 10^{−6}). Five of 10 subjects showed a significant complexity effect (*p* < 0.05) in the short contour condition, and five of 10 showed a significant effect in the long contour condition. All but two of the subjects showed a significant complexity effect in at least one of the length conditions. In an aggregate analysis (Figure 5), there was a main effect of complexity, *F*(4, 89) = 18.55, *p* = 4.07 × 10^{−11}, but no main effect of contour length and no interaction at the *p* = 0.05 level. Looking at contour length in more detail, eight of 10 subjects showed a superiority for long contours (individually significant in two subjects, *p* < .05), while two of 10 showed a very small (nonsignificant) superiority for short contours. Together these results suggest at best a marginal trend toward the expected effect of contour length. The slope of the best-fitting regression line to the combined subjects' data was −0.027 for short contours and −0.02 for long contours.

**Figure 4**


**Figure 5**


*contour cocircularity* (minimization of change in turning angle), which has also been found to play a role in the perception of contour smoothness (Feldman, 1997; Motoyoshi & Kingdom, 2010; Parent & Zucker, 1989; Pettet, 1999). As suggested in Singh & Feldman (2013), our probabilistic contour model can easily be extended by adding a distribution over the “next higher derivative” of the contour tangent. Turning angle *α* is a discretization of the derivative of the tangent with respect to arclength (i.e., curvature), and change in turning angle Δ*α* is a discretization of the second derivative of the tangent with respect to arclength (i.e., change in curvature). To incorporate this, we simply assume that Δ*α* (just like *α* itself) is von Mises distributed, where *β*_{Δ} is a suitably chosen spread parameter. Under the “basic” von Mises model, turning angles were i.i.d., but under the augmented model, successive angles are biased to have similar values, with probability decreasing with deviation from cocircularity, as well as with increasing turning angle itself, as before. As before, this von Mises distribution is very closely approximated by a Gaussian centered at 0°, meaning that, in the augmented model, the joint likelihood *p*(*α*, Δ*α*|*H*_{C}) is well approximated by a bivariate Gaussian centered at (0°, 0°).

We then define the DL of a contour [*α*_{i}] exactly as before, as −log *p*([*α*_{i}]) summed along the contour. The new DL measure penalizes deviation from circularity in addition to deviation from straightness, meaning that contours are penalized not only when they bend but when the degree of bend changes.

In our stimuli, the DLs due to *α* and Δ*α* are not independent (*r* = .83), meaning that the influence of the DL due to change in turning angle is confounded with the DL due to turning angle itself. Nevertheless, as a reasonable post hoc analysis, we can compare the DL due to the original model with the augmented DL incorporating the cocircularity bias. This analysis yields mixed results. Using a linear regression onto log odds, we find that the added factor of Δ*α* DL provides a significantly better fit to the data than the turning-angle DL by itself (by a chi-square test of fit with nested models) in the case of short contours (*p* = 0.033), but only a marginally better fit in the case of long contours (*p* = 0.084). This suggests, consistent with previous literature, that contour cocircularity plays a measurable role in the detection of smooth contours, albeit a smaller one than collinearity itself. We emphasize, however, that this result should be interpreted with caution because of the high correlation between smoothness factors inherent in the design of the experiment.

**Figure 6**


*more* concentrated regions of distinct luminance. Of course, the complexity effect goes in the other direction, with lower-complexity contours being *easier* on average to find so, again, this tendency (which is apparently negligible in any case, given Figure 6) could not explain the observed complexity effect.

*tip*), fourth eleventh (*between* the tip and the middle), or sixth eleventh (the *middle*) of the contour. This manipulation effectively modulates the length of the longest straight segment while holding complexity (DL) approximately constant. Contours with the bending concentrated at the tip, for example, will have relatively long straight segments, while having on average no higher or lower complexity than those with the bend concentrated in the middle. The aim of the experiment is to determine whether this factor, rather than DL itself, modulates performance.

*tip*), the fourth eleventh of the contour (*between* the tip and the middle), or the sixth eleventh of the contour (the *middle*). There were 396 trials (four different surprisals crossed with three locations, with 33 trials per crossed condition).

There was a main effect of complexity, *F*(3, 42) = 18.99, *p* = 6.18 × 10^{−8}, and no main effect of bend location, *F*(2, 42) = 0.94, *p* = 0.397.

**Figure 7**


**Figure 8**


**Figure 9**


**Figure 10**


*why* curvature has this effect is still poorly understood, however, as reflected in the lack of mathematical models that can adequately capture it. In this paper we have modeled contour detection as a statistical decision problem, showing that a few simple assumptions about the statistical properties of smooth contours make fairly strong predictions about the performance of a rational contour detection mechanism, specifically that its performance will be impaired by contour complexity. Contour complexity, in turn, involves contour curvature as a critical term (the only term that reflects the shape of the contour), thus giving a concrete quantification of the effects of contour curvature on detection performance. The data strongly corroborate the effect of complexity on contour detection, showing strong complexity effects in almost every subject. Moreover, the spatial distribution of the complexity does not seem to determine performance, suggesting that a purely local complexity measure (such as the information/complexity defined at each contour point) will not suffice to explain performance, and a holistic measure of contour complexity (such as contour description length) is required.

*were* able to accurately detect the contours, suggesting that the human visual system is doing something different from these algorithms.

*β* in the von Mises distribution that generates the turning angles (see Equation 10), which in turn depends on the condition. In contrast, each distractor path consists of a series of random turns in which the probability of straight continuation is *ϵ* (1/3, as explained above; see Figure 3). The observer's goal is to determine, based on the sample of turns observed, whether the sample was generated from the smooth process (i.e., a target display) or the null process (a distractor display). Because successive turning angles are assumed independent, this means that a smooth curve consists, in effect, of a sample of *N* (219 or 109 on long and short trials, respectively) draws from a Bernoulli process with high success probability (which varies by condition but is often very high, > 0.9), while noise consists of a sample with success probability 1/3.
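This near-triviality is easy to confirm by simulation. The sketch below (with illustrative values; the 0.9 success probability and the trial count are our own choices, not fitted quantities) classifies Bernoulli samples by their log-likelihood ratio:

```python
import numpy as np

rng = np.random.default_rng(1)

def ideal_accuracy(n_draws, p_target=0.9, p_null=1 / 3, n_trials=2000):
    # Each trial: generate a count of straight continuations from either the
    # smooth process or the null process, then classify by the Bernoulli
    # log-likelihood ratio
    correct = 0
    for _ in range(n_trials):
        is_target = rng.random() < 0.5
        p = p_target if is_target else p_null
        k = rng.binomial(n_draws, p)  # number of straight continuations
        llr = (k * np.log(p_target / p_null)
               + (n_draws - k) * np.log((1 - p_target) / (1 - p_null)))
        correct += ((llr > 0) == is_target)
    return correct / n_trials
```

With *N* = 109 (the short-trial value), the two count distributions barely overlap, so simulated accuracy is essentially perfect, far above the roughly 75% correct that our subjects achieved.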

Real observers, however, do *not* have full access to the pixel content of both images, which contain far more information than can be fully acquired under our speeded and masked experimental conditions. A variety of simple assumptions can be made about how image information is limited or degraded, but most give predictions that differ in various ways from what we observed. For example, one might assume that subjects apprehend only a part of the target contour, rather than the contour in its entirety; this predicts subceiling performance, but also predicts a much smaller complexity effect than is actually observed. In any case, the results of Experiment 2 strongly suggest that observers are sensitive to the complexity of the entire contour, not just to short segments of it. Similarly, one can imagine various search strategies that observers might adopt, which would modulate the probability that the target contour is encountered, but as explained above, such strategies cannot explain the effect of contour complexity.

Each perceived turning angle was modeled as the true angle plus Gaussian noise: 0 + *N*(0, *σ*^{2}) for straight continuations, or ±45° + *N*(0, *σ*^{2}) for turns, where the standard deviation *σ* is a parameter fit to each observer. This simple one-parameter model provides a good fit to our subjects' data (Figure 4), including both the absolute level of performance and the magnitude of the complexity effect. All subjects' model fits were fairly similar, with a mean *σ* = 24.45° and standard deviation of 0.35° in the short contour condition, and a mean *σ* = 25.06° and standard deviation of 0.41° in the long contour condition. This model accommodates the complexity effect predicted by the original classification model, while also reflecting, in a particularly simple way, the stimulus uncertainty present in the real visual system.
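A simulation sketch of such a noisy-angle observer (an illustrative implementation of our own: the decision rule here, which evaluates the perceived angles under noise-blurred, mixture-of-Gaussians versions of the two hypotheses, is an assumption about how the comparison would be computed, not the fitting procedure itself):

```python
import numpy as np

rng = np.random.default_rng(2)

CENTERS = np.array([-45.0, 0.0, 45.0])

def mixture_loglik(perceived, p_straight, sigma):
    # Likelihood of noisy perceived angles: a mixture of Gaussians centered
    # on the three possible true angles; the shared 1/(sigma*sqrt(2*pi))
    # normalization cancels in the likelihood ratio, so it is omitted
    weights = np.array([(1 - p_straight) / 2, p_straight, (1 - p_straight) / 2])
    dens = np.zeros_like(perceived)
    for w, c in zip(weights, CENTERS):
        dens += w * np.exp(-(perceived - c) ** 2 / (2 * sigma ** 2))
    return np.sum(np.log(dens))

def noisy_accuracy(p_straight, n_draws=109, sigma=24.5, n_trials=1000):
    correct = 0
    for _ in range(n_trials):
        is_target = rng.random() < 0.5
        p = p_straight if is_target else 1 / 3
        p_turn = (1 - p) / 2
        true_angles = rng.choice(CENTERS, size=n_draws, p=[p_turn, p, p_turn])
        perceived = true_angles + rng.normal(0.0, sigma, size=n_draws)
        llr = (mixture_loglik(perceived, p_straight, sigma)
               - mixture_loglik(perceived, 1 / 3, sigma))
        correct += ((llr > 0) == is_target)
    return correct / n_trials
```

Lower values of `p_straight` correspond to higher-complexity contours, and simulated accuracy declines accordingly, reproducing the qualitative form of the complexity effect.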

We fit two versions of the model: either we let the *σ* parameter vary separately for the three conditions (tip, between, or middle), or we constrained *σ* to be the same in all of these conditions. Using the likelihood ratio test, we found that the additional model complexity involved in letting *σ* vary for each condition was not justified for any of our subjects, suggesting that subjects were sensitive to contour complexity in approximately the same way throughout the entire contour. The likelihood ratio test compares the likelihood of a model with multiple parameters to a nested model with fewer parameters, to see whether the inclusion of the additional parameters is justified. Specifically, it computes

*D* = −2 ln[Λ(*C*)/Λ(*U*)],

where *C* is the constrained version of the model (in our case, a one-parameter model), *U* is the unconstrained model (in our case, the three-parameter model in which each bend location has its own angle-noise parameter), and Λ(*M*) is the likelihood of the observed performance under model *M*, computed across the *n* = 12 conditions (one for each contour complexity crossed with each bend location) from the model's predicted proportion correct *P*_{m} and the subject's observed proportion correct *P*_{s} in each condition. The statistic *D* is compared to a *χ*^{2} distribution with degrees of freedom equal to the difference in the number of parameters of the models (in our case, 2). This results in *p* values ranging from 0.76 to 1, with a mean of 0.84 and standard deviation of 0.08. The fits of the constrained model were not significantly worse than those of the unconstrained model, so we are not justified in using a model that treats the bend locations differently. The constrained model had parameters similar to the fits in Experiment 1, with a mean *σ* = 24.88° and standard deviation of 0.33°.
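The test statistic itself is straightforward to compute. A sketch (the log-likelihood values are hypothetical; for two extra parameters, the *χ*^{2} survival function has the closed form *e*^{−*D*/2}):

```python
import numpy as np

def lrt_pvalue(loglik_constrained, loglik_unconstrained, extra_params=2):
    # Likelihood ratio test: D = -2 ln(Lambda_C / Lambda_U) is asymptotically
    # chi-square distributed with dof equal to the number of extra parameters
    d = -2.0 * (loglik_constrained - loglik_unconstrained)
    if extra_params != 2:
        raise NotImplementedError("closed-form survival function is for 2 dof")
    return np.exp(-d / 2.0)  # chi-square(2) survival function evaluated at d

# Hypothetical fits: the three-parameter model barely improves the
# log-likelihood, so the extra parameters are not justified
p = lrt_pvalue(-10.3, -10.1)
```

Here `p` is about 0.82, comfortably above 0.05, paralleling the range of *p* values (0.76 to 1) we observed.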

*complexity*, defined from an information-theoretic point of view as the negative log of the stimulus probability under the generative model, that is, the Shannon complexity or description length (DL) of the pattern. The ubiquitous influence of simplicity biases and complexity effects in perception and cognition more generally can be seen as a consequence of the general tendency for simple patterns to be more readily distinguishable from noise (Chater, 2000; Feldman, 2000) than complex ones (Feldman, 2004). In past studies, contour detection has usually been studied as a special problem unto itself, whose properties derive from characteristics of visual cortex. Instead, we hope that in the future the problem of contour detection can be treated as a special case of a broader class of pattern detection problems, all of which can be studied in a common mathematical framework in which they differ only in details of the relevant generative models (Feldman, Singh, & Froyen, in press). This observation opens the door to a much broader investigation of perceptual detection problems encompassing detection of patterns and processes well beyond simple contours, such as closed contours and whole objects.

*Psychological Review*, 61(3), 183–193.

*Spatial Vision*, 10, 433–436.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 8(4), 425–455.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 8(6), 679–698.

*Nature*, 407, 572–573.

*Communications of the ACM*, 15, 11–15.

*Journal of Vision*, 2(4):5, 324–353, http://www.journalofvision.org/content/2/4/5, doi:10.1167/2.4.5. [PubMed] [Article]

*PLoS Computational Biology*, 8(5), e1002520.

*Perception & Psychophysics*, 63(7), 1171–1182.

In *Partitioning data sets* (Vol. 19, pp. 331–357). Providence, RI: American Mathematical Society.

*Vision Research*, 37, 2835–2848.

*Nature*, 407, 630–633.

*Cognition*, 93, 199–224.

*Psychological Review*, 112, 243–252.

In *Shape perception in human and computer vision: An interdisciplinary perspective* (pp. 55–70). New York: Springer Verlag.

In *Oxford handbook of computational perceptual organization*. Oxford, UK: Oxford University Press.

*Journal of the Optical Society of America*, 4, 2379–2394.

*Vision Research*, 33, 173–193.

*Spatial Vision*, 13, 51–66.

*Vision Research*, 41, 711–724.

*Psychological Review*, 107, 677–708.

*Vision Research*, 35, 1699–1711.

*Trends in Cognitive Sciences*, 3, 480–486.

*Journal of Physiology–Paris*, 97, 105–119. doi:10.1016/j.jphysparis.2003.09.013.

*Method and means for recognizing complex patterns*. U.S. Patent No. 3,069,654. Washington, DC: U.S. Patent and Trademark Office.

In *Maximum-entropy and Bayesian methods in science and engineering* (pp. 25–29). Dordrecht, The Netherlands: Kluwer Academic.

*Proceedings of the National Academy of Sciences, USA*, 90, 7495–7497.

*Vision Research*, 40, 1775–1783. Retrieved from http://www.sciencedirect.com/science/article/pii/S0042698900000080, doi:10.1016/S0042-6989(00)00008-0.

*Bayesian statistics: An introduction* (Third ed.). West Sussex, UK: Wiley.

*Journal of Neurophysiology*, 88, 2846–2856. doi:10.1152/jn.00289.2002.

*Proceedings of the Royal Society of London Series B, Biological Sciences*, 207(1167), 187–217.

*Journal of Vision*, 10(1):3, 1–8, http://www.journalofvision.org/content/10/1/3, doi:10.1167/10.1.3. [PubMed] [Article]

*IEEE Transactions on Pattern Analysis & Machine Intelligence*, 11, 823–839.

*Spatial Vision*, 10, 437–442.

*Vision Research*, 39, 551–557.

*Vision Research*, 38, 865–879.

In *Computer vision–ECCV 2002* (Vol. 2350, pp. 312–327). Berlin, Germany: Springer Verlag.

*i-Perception*, 1, 121–142. doi:10.1068/i0384.

*Bell System Technical Journal*, 27, 379–423, 623–656.

In *Oxford handbook of perceptual organization*. Oxford, UK: Oxford University Press.

*Psychological Review*, 119, 678–683.

*Proceedings of the National Academy of Sciences, USA*, 102, 939–944.

*Vision Research*, 47, 783–798.

*Cognition*, 119, 325–340.

In *Neural information processing systems 16 (NIPS 2003)*. Cambridge, MA: MIT Press.

*Frontiers in Neuroscience*, 6, 1.

^{2}Note also that these differences would result in only minor differences in the predictions that follow. For example, the more peaked turning-angle distribution observed in Geisler et al. (2001) and Ren and Malik (2002) would predict a slightly weaker complexity effect than the von Mises model, but would not qualitatively alter the results.

^{3}The two distributions' Taylor series are the same up to the first two terms (see Feldman & Singh, 2005).