A new computational analysis is described for estimating 3D shapes from orthographic images of surfaces that are textured with planar cut contours. For any given contour pattern, this model provides a family of possible interpretations that are all related by affine scaling and shearing transformations in depth, depending on the values assigned to its free parameters. Two psychophysical experiments were performed in an effort to compare the model predictions with observers' judgments of 3D shape for developable and non-developable surfaces. The results reveal that observers' perceptions can be systematically distorted by affine scaling and shearing transformations in depth and that the magnitude and direction of these distortions vary systematically with the 3D orientations of the contour planes.

The surface normal (*N*) at any point along a contour can be computed from Equation 1, where *β* is the 2D angle between the 2D ruling vector and a vector tangent to the 2D contour, and *α* is the *z*-component of the surface ruling vector (i.e., the direction of zero curvature). Note that *β* can be measured from 2D image data, whereas *α* is an unknown free parameter. It is important to recognize that lines of curvature on a developable surface are also planar cuts, so that the set of images for which Stevens' analysis is applicable constitutes a subset of those that can be generated using the method of inclined planes.

The surface normal (*N*_{S}(*s*)) at any point along a contour can be computed using Equation 2, a differential equation in which *κ*(*s*) is the curvature of the contour parameterized in terms of arc length, *n*(*s*) is the 2D normal vector to the contour parameterized in terms of arc length, *r*(*s*) is the 2D surface ruling parameterized in terms of arc length, and *V* is the viewing direction. In order to solve this equation, it is necessary to specify the surface normal at one point along the contour to provide the initial conditions. Because the slant and tilt of this normal are unknown, the estimated shape is ambiguous up to a 2-parameter family of possible solutions. Although geodesics on a surface are generally not planar cuts (except along lines of curvature), their optical projections are often quite similar, especially for surfaces that are slanted in depth. Thus, this analysis could potentially be used to obtain a qualitatively correct interpretation of 3D shape for planar cut contours on developable surfaces (Knill, 1992).

Suppose that the contour planes all share a common slant *σ* and a tilt *τ*. Let us begin by constructing an index to represent the order of the contour planes and proportionally subdividing the distances between them to obtain a continuous scale in texture space. This makes it possible to assign a unique value (*v*) of the texture scale to any given point (*x*, *y*) in an image based on the particular contours in its immediate local neighborhood. For convenience, we will align the *y*-axis of the image plane coordinate system to be parallel to the tilt of the contour planes. From the texture scale value at each point on the surface, we can compute the relative depth of that point from the following equation (Equation 3):

*z* = (*Sv* − *y* sin *σ*) / cos *σ*

where *S* is a scaling factor that defines how the texture index is related to distances in physical space. It is important to recognize that *σ*, *τ*, and *S* are unknown free parameters. Thus, this analysis can determine the relative pattern of depth on a surface up to a three-parameter family of possible interpretations, without requiring any assumptions about the underlying surface geometry. There is one degenerate case, however, that deserves to be highlighted. If the slants of the contour planes are close to ±90°, then cos *σ* vanishes and Equation 3 cannot be evaluated. The image contours in that case are reduced to a pattern of parallel straight lines that provide no useful information about 3D shape.
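To make this geometry concrete, here is a minimal Python sketch of the depth computation, assuming the plane-spacing form implied by the text: parallel contour planes with a common slant *σ*, the image *y*-axis aligned to the tilt direction, and plane spacing set by the free scale *S*. The function name is illustrative, not the authors' code.

```python
import numpy as np

def planar_cut_depth(v, y, S, sigma_deg):
    """Relative depth implied by the planar cut model.

    v         : texture-scale value at the image point (interpolated contour index)
    y         : image y-coordinate, with the y-axis aligned to the tilt direction
    S         : free scaling parameter relating the texture index to physical distance
    sigma_deg : slant of the contour planes in degrees (free parameter)
    """
    sigma = np.radians(sigma_deg)
    if np.isclose(np.cos(sigma), 0.0):
        # Degenerate case noted in the text: slants near +/-90 deg project
        # to parallel straight lines, and the depth cannot be evaluated.
        raise ValueError("contour-plane slant of +/-90 deg is degenerate")
    # A point with texture value v lies on the plane y*sin(sigma) + z*cos(sigma) = S*v,
    # so z = (S*v - y*sin(sigma)) / cos(sigma).
    return (S * v - y * np.sin(sigma)) / np.cos(sigma)
```

With *σ* = 0 the planes are frontoparallel and the depth reduces to *Sv*; varying *S* and *σ* traces out the family of affinely related interpretations described above.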

The average coefficient of determination (*R*^{2}) was 0.84. A similar analysis was used to measure the consistency among all possible pairs of observers, and the average value of *R*^{2} for these correlations was 0.79.

These differences were statistically significant, *F*(4, 24) = 12.13, *p* < 0.001. Affine correlations between the judged depth profiles and the ground truth were performed to measure the magnitude of these perceptual distortions in each condition using the following linear model:

*Z*′ = Zscale · *Z* + Xshear · *X*

where *Z*′ is the judged depth of a given probe point, *X* is its horizontal position along the scan line, *Z* is the true depth of the point in physical space, and (Xshear, Zscale) are the affine coefficients to be estimated. The results of this analysis are presented in Table 1. If observers' perceptions had been based on the analyses proposed by Stevens (1981, 1986) or Knill (1992, 2001), then their judgments should have been most accurate in Condition B, because that is the only condition for which the contours were aligned along surface geodesics or lines of curvature. However, the results do not confirm this prediction. The judged depth profiles in Condition D had the least amount of shear relative to the ground truth; that is the condition in which the horizontal scan line was perpendicular to the tilt of the contour planes.

Condition | *R*^{2} | Xshear | Zscale
---|---|---|---
A | 0.98 | −0.18 | 0.56
B | 0.99 | −0.13 | 0.54
C | 0.99 | −0.07 | 0.54
D | 0.98 | −0.03 | 0.53
E | 0.93 | 0.14 | 0.47
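The affine correlation described above can be sketched as an ordinary least-squares regression. This is an illustrative reconstruction rather than the authors' code; in particular, the intercept term (absorbing any constant depth offset) is an added assumption.

```python
import numpy as np

def affine_fit(z_judged, x, z_true):
    """Fit Z' = Zscale*Z + Xshear*X + c by ordinary least squares.

    Returns (zscale, xshear, r_squared). The intercept c is an
    assumption here; it absorbs any constant depth offset.
    """
    A = np.column_stack([z_true, x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, z_judged, rcond=None)
    pred = A @ coef
    ss_res = np.sum((z_judged - pred) ** 2)
    ss_tot = np.sum((z_judged - z_judged.mean()) ** 2)
    return coef[0], coef[1], 1.0 - ss_res / ss_tot
```

On data generated with a known shear and depth scale, the fitted coefficients recover those values; applied to judged depth profiles, Zscale near 0.5 and Xshear near 0 would correspond to the pattern in Table 1.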

*Mathematica* was used to evaluate Equation 2. Because the analyses proposed by Stevens and Knill compute local surface normals along a contour, it was necessary to compute the local depth gradients from the normals and then integrate the gradient function along a scan line in order to determine the predicted depth profile for any given set of parameter values.
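The gradient-integration step can be sketched as follows, assuming the depth gradients have already been sampled at discrete positions along the scan line. This is an illustrative reconstruction, not the authors' *Mathematica* implementation.

```python
import numpy as np

def depth_profile_from_gradient(x, dz_dx, z0=0.0):
    """Integrate sampled depth gradients dz/dx along a scan line.

    Uses the trapezoid rule between successive samples. The constant of
    integration z0 is arbitrary, reflecting the initial-condition
    ambiguity noted in the text.
    """
    dx = np.diff(x)
    increments = 0.5 * (dz_dx[:-1] + dz_dx[1:]) * dx
    return z0 + np.concatenate([[0.0], np.cumsum(increments)])
```

Because only the gradient is integrated, the recovered profile is determined up to the additive constant `z0`, which is why the model fits below compare relative rather than absolute depths.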

The average *R*^{2} for the individual fits over all observers, models, and conditions was 0.89, which is somewhat noisier than the fits obtained from the composite data, for which the average *R*^{2} was 0.94. In all other respects, however, the individual and composite fits revealed the same basic pattern of results.

Condition | Fit to ground truth: *R*^{2} | Fit to ground truth: Slope | Fit to data: *R*^{2} | Fit to data: Slope | AIC
---|---|---|---|---|---
Stevens' model | | | | |
A | 0.93 | 3.24 | 0.96 | 2.26 | 158.8
B | 0.99 | 1.01 | 0.97 | 1.85 | 130.6
C | 0.99 | 1.46 | 0.98 | 3.04 | 285.9
D | 0.99 | 11.50 | 0.96 | 6.60 | 13203.3
E | 0.58 | 66.67 | 0.82 | 44.10 | 41399.2
Knill's model | | | | |
A | 0.99 | 0.88 | 0.87 | 2.12 | 278.8
B | 0.99 | 1.01 | 0.96 | 1.27 | 71.3
C | 0.99 | 0.93 | 0.97 | 1.54 | 70.8
D | 0.99 | 0.89 | 0.96 | 1.81 | 87.3
E | 0.99 | 0.88 | 0.74 | 1.94 | 80.6
Planar cut model | | | | |
A | 0.99 | 1.02 | 0.98 | 0.98 | 53.0
B | 0.99 | 1.00 | 0.99 | 0.98 | 47.6
C | 0.99 | 1.00 | 0.99 | 0.99 | 48.7
D | 0.99 | 1.00 | 0.98 | 0.98 | 53.5
E | 0.99 | 1.00 | 0.96 | 0.96 | 56.0
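For readers unfamiliar with the AIC column, a common least-squares form of the criterion is sketched below. The excerpt does not specify which convention the authors used, so this is only illustrative; conventions differ by additive constants that cancel when comparing models fit to the same data.

```python
import math

def aic_least_squares(rss, n, k):
    """Akaike information criterion for a least-squares fit.

    rss : residual sum of squares of the fit
    n   : number of data points
    k   : number of free parameters (how the error variance is counted
          varies by convention, shifting all values by a constant)
    """
    return n * math.log(rss / n) + 2 * k
```

Lower values indicate a better trade-off between goodness of fit and number of free parameters, which is the sense in which the planar cut model's small AIC values in Table 2 favor it over the alternatives.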

The standard deviations of the best-fitting *S*, *σ*, and *τ* parameters among different observers, averaged over conditions, were 0.13, 10.40°, and 3.98°, respectively.

Condition | *S* | *σ* (°) | *τ* (°)
---|---|---|---
A | 0.60 | −60 | 25
B | 0.53 | −57 | 14
C | 0.52 | −44 | 10
D | 0.58 | −37 | −1
E | 0.47 | −44 | −5

Given the systematic variations in the best-fitting *σ* and *τ* parameters across conditions, it is unlikely that these were determined by statistical priors. Rather, these results suggest that there is visual information within the 2D images to estimate the orientations of the contour planes. One likely source of information about the tilts of these planes is provided by the orientations of the 2D contours. Indeed, the best-fitting tilts are almost perfectly correlated with the direction of the average 2D contour normal within each image (*R* = 0.99). There is also a high negative correlation between the best-fitting slants and the amplitudes of the 2D contours (*R* = −0.88). The reason for this latter effect, we suspect, is that the apparent slant of a surface increases with the 2D amplitude of its projected contours, and that the estimated slants of the contour planes are negatively related to the apparent surface slants.

The average *R*^{2} was 0.89. A similar analysis was used to measure the consistency among all possible pairs of observers, and the average *R*^{2} for these correlations was 0.73.

These computations were performed in *Mathematica*. An affine correlation was then performed to compare the judged patterns of relief with the ground truth using the following linear model:

*Z*′ = Zscale · *Z* + Xshear · *X* + Yshear · *Y*

where *Z*′ is the judged depth of a given probe point, (*X*, *Y*, *Z*) are the Cartesian coordinates of that point in physical space, and (Xshear, Yshear, Zscale) are the affine coefficients to be estimated. The results of this analysis are presented in Table 4. We also performed affine correlations on the judgments obtained from individual observers. The average *R*^{2} values for the individual correlations were slightly lower than those for the composite data (0.86 versus 0.92), but the relative pattern of coefficients in the different conditions was nearly identical. These findings confirm the theoretical prediction of the planar cut model that the possible interpretations for any given contour pattern are related to one another by affine scaling and shearing transformations in depth.

Object (*σ*, *τ*) | *R*^{2} | Xshear | Yshear | Zscale
---|---|---|---|---
O1 (30°, 0°) | 0.92 | 0.02 | 0.39 | 0.76
O1 (30°, 90°) | 0.92 | 0.29 | 0.03 | 0.83
O2 (30°, 180°) | 0.92 | 0.00 | −0.39 | 0.72
O2 (0°, 0°) | 0.93 | 0.03 | −0.01 | 0.78

Note that the effects of the model parameters (*S*, *σ*, and *τ*) are not independent; that is, they can all have a coordinated influence on both shear and depth scaling. Thus, an additional analysis was performed to determine how closely the planar cut model could account for the specific patterns of distortion in these data. This was achieved by implementing the model for the four different contour patterns and computing the values of its free parameters that provide the best least-squares fits to the average response profiles. The results of this analysis (see Table 5) reveal that the planar cut model can account, on average, for 85% of the variance in the perceived relative depths between the different probe points in each condition, and similarly good fits were also obtained from analyses of the individual observers.

Object (*σ*, *τ*) | *R*^{2} | *S* | *σ* (°) | *τ* (°)
---|---|---|---|---
O1 (30°, 0°) | 0.83 | 0.78 | 4 | 9
O1 (30°, 90°) | 0.83 | 0.83 | 10 | 91
O2 (30°, 180°) | 0.84 | 0.75 | 6 | 174
O2 (0°, 0°) | 0.89 | 0.70 | 3 | 18
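A least-squares search for the best-fitting parameters could be sketched as a coarse grid search over (*S*, *σ*, *τ*), assuming a planar cut depth-prediction function of the form used earlier in this section. The prediction function and its tilt convention are assumptions made for illustration, not the authors' implementation.

```python
import itertools
import numpy as np

def predict_depth(x, y, v, S, sigma_deg, tau_deg):
    # Planar cut depth under the assumed plane geometry:
    # u is the image coordinate measured along the tilt direction tau.
    sigma, tau = np.radians(sigma_deg), np.radians(tau_deg)
    u = x * np.cos(tau) + y * np.sin(tau)
    return (S * v - u * np.sin(sigma)) / np.cos(sigma)

def grid_search(x, y, v, z_judged, S_grid, sigma_grid, tau_grid):
    """Coarse grid search for the (S, sigma, tau) minimizing squared error."""
    best = None
    for S, sg, tg in itertools.product(S_grid, sigma_grid, tau_grid):
        err = np.sum((predict_depth(x, y, v, S, sg, tg) - z_judged) ** 2)
        if best is None or err < best[0]:
            best = (err, S, sg, tg)
    return best[1:]
```

In practice a coarse grid would be followed by a local optimizer, but even this sketch recovers the generating parameters exactly when the data are noise-free.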

Recall that the planar cut model assigns a unique value (*v*) of the texture scale to any given point (*x*, *y*) in an image based on the particular contours in its immediate local neighborhood. The discussion thus far has implied that the contour planes must be equally spaced to create a uniform texture scale, but this is clearly too strong an assumption. Indeed, the stimuli employed in Experiment 1 were all generated with contour planes that were unequally spaced. Note in Figure 2, for example, that the black stripes appear narrower than those with a lighter gray color. This perception is most likely based on the fact that the 2D widths of the black image contours are consistently smaller than those of their neighboring gray contours. This suggests that the relative spacing of contours in local regions of image space could be used to scale the relative spacing of the contour planes in 3D space.
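The proportional-subdivision idea can be sketched with simple linear interpolation along a scan line. The function and its arguments are hypothetical names, not the authors' implementation.

```python
import numpy as np

def texture_scale(x, contour_x, contour_index):
    """Assign a continuous texture-scale value v to positions x along a
    scan line by proportionally subdividing between contour crossings.

    contour_x     : sorted positions where contours cross the scan line
    contour_index : index value assigned to each contour; for unequally
                    spaced contour planes these need not be consecutive
                    integers (e.g., they could be rescaled by the local
                    2D contour spacing, as suggested in the text)
    """
    return np.interp(x, contour_x, contour_index)
```

Passing non-integer `contour_index` values is how unequal plane spacing would enter: a contour plane lying closer to its neighbor in 3D would be assigned a proportionally closer index.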

Assuming the *z*-axis is perpendicular to the image plane, the relative depth of each point along a planar cut contour can be determined from the following equation:

*z* = *c* − tan *σ* (*x* cos *τ* + *y* sin *τ*)

where *σ* and *τ* are the slant and tilt of the inclined plane and *c* is its depth offset. The problem with this approach is that the number of free parameters grows with the number of contours to be analyzed, although the number of constraints grows as well, because points of intersection between any two contours must have the same depth (see Ecker, Kutulakos, & Jepson, 2007). As a practical matter, we suspect that this is only feasible for regularly shaped contours (as in the right panel of Figure 7) whose slants and/or tilts can be reliably estimated from their optical projections.

*Acta Psychologica*, 131, 178–193.

*Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, 383–390.

*Journal of Mathematical Imaging and Vision*, 2, 327–350.

*Perception*, 23, 1335–1337.

*Artificial Intelligence*, 37, 333–353.

*Journal of the Optical Society of America A*, 9, 1449–1464.

*Journal of the Optical Society of America A*, 18, 12–35.

*Perception*, 30, 321–330.

*Perception*, 30, 431–448.

*Perception*, 30, 403–410.

*Vision Research*, 40, 217–242.

*Journal of Vision*, 4(10):3, 860–878, http://www.journalofvision.org/content/4/10/3, doi:10.1167/4.10.3.

*Vision Research*, 44, 2135–2145.

*International Journal of Computer Vision*, 23, 149–168.

*Vision Research*, 41, 2653–2668.

*Journal of Experimental Psychology: Human Perception and Performance*, 16, 653–664.

*Geographical Review*, 47, 507–520.

*Journal of Vision*, 6(9):7, 933–954, http://www.journalofvision.org/content/6/9/7, doi:10.1167/6.9.7.

*Perception*, 37, 1471–1487.

*Artificial Intelligence*, 17, 47–53.

*From pixels to predicates* (pp. 93–110). Norwood, NJ: Ablex.

*Biological Cybernetics*, 56, 355–366.

*Geographical Journal*, 79, 213–219.

*Trends in Cognitive Sciences*, 8, 115–121.

*Vision Research*, 42, 837–850.

*Psychological Science*, 15, 40–46.

*Journal of Experimental Psychology: Human Perception and Performance*, 16, 665–674.

*Vision Research*, 45, 1501–1517.

*Journal of Vision*, 7(12):9, 1–16, http://www.journalofvision.org/content/7/12/9, doi:10.1167/7.12.9.

*Psychological Review*, 109, 91–115.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 17, 120–135.

*Journal of Vision*, 5(10):7, 834–862, http://www.journalofvision.org/content/5/10/7, doi:10.1167/5.10.7.

*Communications of the ACM*, 15, 100–103.

*IEEE Transactions on Computers*, C-32, 28–33.