Author Response to Letter  |   July 2015
Can a Bayesian analysis account for systematic errors in judgments of 3D shape from texture? A reply to Saunders and Chen
James T. Todd
Journal of Vision July 2015, Vol. 15, 22. doi:10.1167/15.9.22
Abstract

Saunders and Chen (2015) have recently proposed a Bayesian model of the perception of 3D shape from texture that they claim is superior to an alternative model based on scaling contrast that was originally proposed by Todd, Thaler, Dijkstra, Koenderink, and Kappers (2007). This commentary will review a variety of empirical findings that are relevant to this issue, and it will also evaluate how well these findings can be explained by different possible models that have been proposed in the literature. The results will demonstrate that the scaling contrast model can account for almost all of the factors that can influence apparent shape from texture with just two free parameters. The Bayesian model of Saunders and Chen has a greater number of free parameters than the scaling contrast model, yet it is not sufficiently developed to make quantitative predictions about any of these factors. Moreover, there are several other processes that would be necessary to actually implement their model that are not mentioned in their theoretical discussion. These include some mechanism for measuring veridical slant from texture, and a mechanism for computing global 3D shape from local estimates of optical slant.

Introduction
The perception of shape from texture has been an active area of research for over six decades, since the phenomenon was first described by James Gibson (1950). As a general rule, observers tend to systematically underestimate depth and slant from texture, but a wide variety of factors can modulate this effect. In an effort to provide a unified explanation of how these factors influence perceptual performance, Todd et al. (2007) introduced a new source of information for estimating 3D shape from texture called scaling contrast, which is highly correlated with perceptual judgments over a remarkably broad range of conditions.
Saunders and Chen (2015) argue that distortions of apparent slant from texture may be theoretically misleading. They contend that observers' internal representations of slant are veridical, but that perceptual judgments can be biased by flatness cues, such as the absence of accommodative blur gradients, or a Bayesian prior for fronto-parallel surfaces. However, these arguments are almost entirely speculative. Saunders and Chen do not present any data to show that internal slant estimates are veridical; they do not propose a model to show that veridical slant estimates can in fact be computed from texture in a manner that is compatible with human perception; and they do not present any simulations to show that flatness cues or fronto-parallel priors can successfully account for existing research findings on this topic. 
Perceptual distortions of 3D shape from texture
Before discussing the details of their paper, it is useful to summarize some of the systematic errors in the perception of 3D shape from texture that have been reported previously. Several of these effects were revealed in an experiment performed by Todd et al. (2007). They presented observers with images of hyperbolic cylinders such as the ones shown in Figure 1. The slants of the asymptotic planes were systematically manipulated, as were the sign of curvature, the camera angle with which the images were rendered, and the viewing angle with which they were observed. The task on each trial was to adjust the shape of a smooth curve with four degrees of freedom to match the apparent shape of a surface cross-section in depth.
Figure 1. Example images from Todd et al. (2007) of concave and convex hyperbolic cylinders that were rendered with different camera angles. The asymptotic slant of each depicted surface is 55°.
The red curves in Figure 2 show the average results of these adjustments in each condition and the black curves show the actual shapes of the depicted cross-sections. Note in these data that the convex surfaces produced more apparent depth than the concave surfaces (see also Todd, Thaler, & Dijkstra, 2005). This can be seen quite clearly in Figure 1, in which all of the depicted surfaces have identical asymptotic slants of 55°. This effect is also immediately evident when the displays are viewed from the correct visual angles, which cannot be achieved when they are combined in a single figure. Another effect that can be seen quite clearly in Figure 1 is that the displays rendered with a 60° camera angle produce more apparent depth than the ones rendered with a 20° angle (see also Todd et al., 2005). However, the viewing angle with which the images were observed produced the opposite effect. That is to say, the ones observed with a 60° viewing angle produced less apparent depth than the ones viewed with a 20° angle (see also Saunders & Backus, 2006). One final thing to note in these data is that the distortions of apparent shape are nonlinear, and cannot be explained by a simple compression in depth. This is revealed in Figure 2 by the reductions in apparent curvature at the tips of the cylinders in the Camera 60° − View 60° conditions, and the small amounts of apparent curvature in the asymptotic planes for several of the concave surfaces. 
Figure 2. Data and simulations from Todd et al. (2007). The red curves show the judged depth profiles in each condition averaged over observers and textures. The black curves show the ground truth, and the dashed blue curves show the shapes predicted by Equation 1.
Todd et al. (2007) tried several procedures to model the pattern of judged shapes across the different experimental conditions, and the best fits to the data were obtained using the following equation:

\[Z_\phi = \frac{k\,(S'_{Max} - S'_\phi)}{S'_{Max} + S'_{Min}}\tag{1}\]

where Zϕ is the relative depth of a surface point in a visual direction (ϕ), S′ϕ is the projected length of an optical texture element in that direction, S′Max and S′Min are the maximum and minimum projected texture lengths in the entire image, and k is a constant. Astute readers will recognize that Equation 1 is a form of contrast normalization, a mechanism that is observed in a wide variety of sensory processes. That is why the model is referred to as scaling contrast. Todd et al. (2007) also provided a mathematical analysis to show that scaling contrast uniquely specifies the depth contrast within a visible scene.
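To make the computation concrete, the following is a minimal Python sketch of Equation 1. It assumes the projected texture lengths along a scan line have already been measured; the function name and the example values are illustrative only.

```python
import numpy as np

def scaling_contrast_depth(s, k=1.0):
    """Relative depth profile from Equation 1.

    s : projected texture lengths S'_phi, one per visual direction
        along a scan line.
    k : the model's free scaling constant.
    """
    s = np.asarray(s, dtype=float)
    return k * (s.max() - s) / (s.max() + s.min())

# Projected lengths that shrink along a scan line (hypothetical values):
lengths = [1.00, 0.95, 0.88, 0.80, 0.71]
print(scaling_contrast_depth(lengths))
# Depth grows as the projected texture shrinks; the point with the
# largest projected length is assigned zero relative depth.
```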
One limitation of Equation 1 is that it overestimates the magnitude of apparent depth when the texture in peripheral vision has a high spatial frequency. Todd et al. (2007) corrected for this by limiting the field of view over which scaling contrast is calculated for the convex surfaces viewed with large visual angles. The dashed blue curves in Figure 2 show the estimated surface cross-sections based on this model using a single value of k across all of the different conditions. Note that this produces almost perfect fits to the observers' judgments, and that it captures all of the different effects described above involving camera angles, viewing angles, and the sign of surface curvature. 
Additional support for the scaling contrast model has more recently been provided by Todd, Christensen, and Guckes (2010). They asked observers to perform an adjustment task to indicate the apparent slants of planar surfaces like the ones shown in Figure 3, with depicted slants of 0°, 30°, 50°, or 70°. The stimuli were rendered under orthographic projection or with camera angles of 15°, 30°, or 60°, but they were all viewed at a fixed visual angle of 15°. The filled circles in Figure 4 show the average adjusted slants for each condition. The solid curves in this figure show the expected psychometric functions for each camera angle that were generated from Equation 1. Note that the model captures all of the systematic variations in apparent slant due to the depicted slant and camera angle. Moreover, because there were no large viewing angles in this experiment, these fits were achieved with just one free parameter (k). The remarkably accurate fits of perceived shape from texture provided by the scaling contrast model set a high bar for any alternative model that is proposed to replace it.
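To illustrate how psychometric functions of this kind can be generated from Equation 1, the sketch below simulates a textured plane through fixation and reads a predicted slant off the slope of the resulting depth profile. The geometry (projected length proportional to 1/r, with r(ϕ) = cos s/cos(s + ϕ) for a plane at unit distance), the rescaling of image positions to a 15° viewing angle, the linear slope fit, and k = 1 are all my assumptions for illustration; this is not the fitting procedure used by Todd et al. (2010).

```python
import numpy as np

def predicted_slant(slant_deg, camera_deg, view_deg=15.0, k=1.0, n=201):
    """Qualitative Equation 1 prediction for an image of a textured plane."""
    if camera_deg == 0:
        return 0.0  # orthographic: no scaling variation, so Eq. 1 predicts flat
    s = np.radians(slant_deg)
    half = np.radians(camera_deg / 2)
    phi = np.linspace(-half, half, n)        # rendered visual directions
    proj = np.cos(s + phi)                   # S'_phi (proportional to 1/r)
    z = k * (proj.max() - proj) / (proj.max() + proj.min())  # Equation 1
    # The image is always viewed at 15 deg, so rendered positions are
    # rescaled onto the smaller viewed screen before measuring the slope.
    y = np.tan(phi) * np.tan(np.radians(view_deg / 2)) / np.tan(half)
    slope = np.polyfit(y, z, 1)[0]           # best-fitting linear depth gradient
    return np.degrees(np.arctan(slope))

for cam in (60, 30, 15, 0):
    print(cam, round(predicted_slant(50, cam), 1))
# Predicted slant rises steeply with camera angle and collapses to zero
# under orthographic projection, mirroring the pattern in Figure 4.
```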
Figure 3. Four images of a planar surface with a 50° slant used by Todd et al. (2010). From left to right, these images were rendered with camera angles of 60°, 30°, 15°, and orthographic projection.
Figure 4. The average judged slants obtained by Todd et al. (2010) for images of planar surfaces rendered with different camera angles. The solid curves show the expected psychometric functions for each camera angle that were generated from Equation 1.
A Bayesian analysis of 3D slant from texture
Let us now consider the alternative model proposed by Saunders and Chen (2015). According to their account, texture information is represented as a likelihood function, whose mean corresponds veridically to the depicted slant, and whose variance increases with decreasing slant. The spread of the likelihood function represents the range of slants that would be consistent with the available information given the presence of internal noise (e.g., from measurement error). When multiple sources of information are available, their likelihood functions are multiplied and normalized to estimate a posterior probability. For example, when an image of a textured surface is observed on a projection screen or computer monitor, the absence of accommodative blur gradients provides conflicting information that specifies a frontal surface (Watt, Akeley, Ernst, & Banks, 2005). The information from conflicting frontal cues can be modeled as an additional likelihood function with a peak at zero, which is multiplied with the likelihood function for texture to produce a systematic underestimation of apparent slant. Note that this model requires at least three free parameters: one to specify the variance of the frontal cue likelihood function, and two (or more) others to specify how the variance of the texture likelihood functions changes with depicted slant.
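For concreteness, here is what this cue combination looks like in the standard Gaussian case, where multiplying two likelihoods and normalizing yields a reliability-weighted average. The particular means and spreads below are illustrative placeholders, not parameters fitted by Saunders and Chen (2015).

```python
import numpy as np

def combine_gaussian_cues(mu_tex, sd_tex, mu_flat=0.0, sd_flat=20.0):
    """Posterior mean/sd when two Gaussian likelihoods are multiplied.

    mu_tex, sd_tex : mean and spread of the texture likelihood (deg).
    mu_flat, sd_flat : flatness cue (or frontal prior) centered on 0 deg.
    All values are illustrative, not fitted.
    """
    w_tex = 1.0 / sd_tex**2        # reliability = inverse variance
    w_flat = 1.0 / sd_flat**2
    mu = (w_tex * mu_tex + w_flat * mu_flat) / (w_tex + w_flat)
    sd = np.sqrt(1.0 / (w_tex + w_flat))
    return mu, sd

# A veridical 50 deg texture estimate is pulled toward frontal:
print(combine_gaussian_cues(mu_tex=50.0, sd_tex=10.0))  # ~ (40.0, 8.9)
```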
There are several aspects of Saunders and Chen's (2015) analysis that deserve to be highlighted. First, their model is described only as a general concept: they do not flesh out any details about the relevant likelihood distributions. This is especially problematic for evaluating the model empirically, because in its current state of development it does not make precise quantitative predictions about anything. Thus, there is no way of determining whether a set of distributions exists that is capable of predicting the specific pattern of judged slants among the various conditions in the experiment they report to test their model.
Another important aspect of Saunders and Chen's (2015) analysis is that they say nothing at all about how a veridical likelihood function might be computed from an actual textured image. This omission is especially surprising because the veridicality of slant representations is a critical distinction between their model and scaling contrast. In the next section I will describe some possible methods for computing slant from texture that could, in principle, be used as a front end for their Bayesian analysis.
Computation of slant from texture
There are two traditional models for computing slant from texture that were first proposed in the 1950s (see Purdy, 1958) and have been studied extensively since then. These models are based on specific local measures of optical texture elements that are defined in Figure 5. One common approach for estimating surface slant is to analyze the foreshortening of optical texture elements based on the assumption that the texture is isotropic. It is possible in that case to estimate the local optical slant (σ) at a given surface location using the following equation:

\[\sigma = \cos^{-1}\!\left(\frac{C'}{S'}\right)\tag{2}\]

where S′ is the projected major axis of an optical texture element, and C′ is the projected length of its compressed minor axis.
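As a worked example of Equation 2 (assuming an isotropic, i.e., circular, texture element):

```python
import numpy as np

def slant_from_foreshortening(major, minor):
    """Equation 2: optical slant from the projected axes of a texture
    element, assuming the physical element is isotropic (circular)."""
    return np.degrees(np.arccos(minor / major))

# A circle that projects to a 2:1 ellipse implies a 60 deg optical slant:
print(slant_from_foreshortening(major=1.0, minor=0.5))  # 60.0
```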
Figure 5. The variables used to estimate local optical slant from foreshortening and scaling gradients.
An alternative approach is to estimate surface slant by measuring the changes of optical texture across different local neighborhoods of an image, based on an assumption that the surface texture is statistically homogeneous. This can be achieved using the following equation:

\[\tan\sigma = \frac{2\,(S'_1 - S'_2)}{D'\,(S'_1 + S'_2)}\tag{3}\]

where D′ is the projected angular distance between neighboring optical texture elements in the direction of slant, and S′1 and S′2 are the projected lengths of those texture elements (see Figure 5). In the limit of an infinitesimally small D′, the right side of Equation 3 equals the normalized scaling gradient (Gårding, 1992). It is interesting to note that Equations 1 and 3 are quite similar to one another. The primary difference is that scaling contrast exploits the entire range of scale differences within an image, whereas the normalized scaling gradient only considers them within local neighborhoods. Because of this difference, the scaling gradient is much more sensitive to noise than scaling contrast. For example, if a 2% error were applied to the measured values of S′1 or S′2 within a 5° neighborhood, the estimated slant would deviate from the ground truth by 13° for a 5° slant and 3° for a 60° slant. For scaling contrast, on the other hand, a 2% error in the measured values of S′Max or S′Min would produce estimated slant errors of only 1.6° and 1.5°, respectively, for depicted slants of 5° and 60°.
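These noise figures can be checked numerically. The sketch below constructs projected lengths whose discrete normalized gradient matches a given slant, perturbs S′1 by 2%, and recovers slant with Equation 3. Because the discrete form of Equation 3 is reconstructed here, the recovered deviations (roughly 12° and 3°) should be read as approximations of the values quoted above; the corresponding scaling contrast figures require the full depth-profile model and are not reproduced.

```python
import numpy as np

def slant_from_scaling_gradient(s1, s2, d_deg):
    """Equation 3: optical slant from two neighboring projected lengths
    separated by d_deg degrees of visual angle."""
    d = np.radians(d_deg)
    return np.degrees(np.arctan(2 * (s1 - s2) / (d * (s1 + s2))))

def lengths_for_slant(slant_deg, d_deg):
    """Projected lengths whose discrete normalized gradient equals
    tan(slant) over a d_deg neighborhood."""
    g = np.tan(np.radians(slant_deg)) * np.radians(d_deg) / 2
    return 1 + g, 1 - g

for true_slant in (5, 60):
    s1, s2 = lengths_for_slant(true_slant, d_deg=5)
    recovered = slant_from_scaling_gradient(1.02 * s1, s2, d_deg=5)
    print(true_slant, round(recovered - true_slant, 1))  # ~12.4 and ~2.9
```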
When evaluating these alternative models, it is important to emphasize that Equations 2 and 3 do not measure the physical slant of a surface. What they measure instead is optical slant, which is defined as the angle between the viewing direction of a visible surface patch and the local surface normal of that patch. Consider the perceptual analysis of a planar surface from texture. By definition, all local regions on a planar surface have exactly the same physical slant. However, they do not all have the same optical slant (σ), because σ varies linearly with viewing direction for planar surfaces (see Todd et al., 2005). For example, if a 50° slanted surface is observed with a visual angle of 60° (see Figure 3A), the optical slants will vary from a maximum value (σMax) of 80° to a minimum value (σMin) of 20°. 
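The linear relation between optical slant and viewing direction is easy to verify numerically for the example just given:

```python
import numpy as np

# For a planar surface, optical slant = physical slant + visual direction.
physical_slant = 50.0                 # degrees
phi = np.linspace(-30.0, 30.0, 5)     # a 60 deg viewing angle
print(physical_slant + phi)           # [20. 35. 50. 65. 80.]
```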
Psychometric relation between judged slant and the ground truth
It is important to recognize that the scaling contrast model makes no predictions whatsoever about how texture should be combined with other sources of information. Thus, the only aspect of Saunders and Chen's (2015) experiment that is directly relevant to evaluating this model is the curved psychometric function they obtained with monocular Voronoi textures. These results are quite similar to those obtained by Todd, Christensen, and Guckes (2010) under comparable conditions (see Figure 4), and they are strongly predicted by the scaling contrast model. They are not compatible, however, with the foreshortening or scaling gradient models, which would produce linear psychometric functions in this condition. I suspect that the Bayesian model could also account for these results by tweaking the variance of the likelihood functions for different depicted slants, but this cannot be determined with certainty in the absence of any numerical simulations.
Effects of camera angle and viewing angle
When an image of a textured surface is viewed with a visual angle that matches the camera angle with which it was rendered or photographed, the apparent depth and slant of the surface increase systematically with the field of view (Todd et al., 2005; Todd et al., 2007). This is a huge effect that can cause a fivefold difference in apparent slant as the field of view is increased from 5° to 50°, and it is a strong prediction of the scaling contrast model. Saunders and Chen (2015) argue that texture information becomes more reliable with large fields of view, which could, in principle, be incorporated in their model by adding additional free parameters to specify how the spreads of the likelihood functions increase or decrease as a function of viewing angle. Although this may appear at first blush to be an obvious prediction of a Bayesian model, it is much less obvious when it is pointed out that increasing the field of view also increases the variance of optical slant measures across different local regions.
Is there any evidence to support their claim that slant judgments are more reliable with large fields of view? In a classic paper on slant discrimination from texture, Knill (1998a) argued that observers' thresholds are primarily determined by the maximum value of optical slant (σMax). He also demonstrated that when σMax is held constant, changes in the field of view that increase or decrease σMin have no significant effect on performance. Because Saunders and Chen (2015) assume that discrimination thresholds of slant from texture are proportional to the spreads of the likelihood distributions, it follows from their hypothesis that changes in the field of view that do not alter σMax should have no significant effect on judged slant. As it turns out, many of the field of view manipulations in Todd et al. (2005) and Todd et al. (2007) satisfied this condition, and the results in those cases do not confirm the Bayesian prediction. For example, when viewed from the correct visual angles, the two concave surfaces depicted in Figure 1 both have identical σMax values of 55°, but the apparent depth-to-width ratio of the 60° image is 80% larger than that of the 20° version. This finding indicates that Saunders and Chen (2015) will have a difficult time explaining the effects of field of view within their proposed Bayesian perspective.
When an image of a textured surface is viewed with a visual angle that differs from the camera angle with which it was rendered, the effects are even more theoretically revealing. If the 3D shape of a depicted surface were estimated from local scaling gradients as described by Equation 3, then the judged depth-to-width ratio obtained with a 20° viewing angle would be 3.27 times larger than what would be obtained if the same image were viewed at a 60° angle. The opposite effect would occur if shape were computed from local texture foreshortening as described by Equation 2. The empirical results of Todd et al. (2007) revealed that the apparent depth-to-width ratios of hyperbolic cylinders were on average 1.55 times larger for the 20° viewing angles than for the 60° viewing angles. Thus, the direction of the effect was consistent with what would be expected from an analysis of local scaling gradients, but the magnitude was much smaller (see also Saunders & Backus, 2006). As is shown in Figure 2, the precise magnitude of this effect is predicted quite accurately by the scaling contrast model.
It is also interesting to note in this regard that the effects of viewing angle for a fixed visual image are the opposite of what would be expected if the reliability of texture information increases with the field of view, as suggested by Saunders and Chen (2015). These authors have argued in a personal communication that images with incompatible camera and viewing angles are best considered as cue conflict stimuli, and that the effects obtained by Saunders and Backus (2006), Todd et al. (2007), and Todd et al. (2010) could potentially be explained as a linear combination of the estimated slants from scaling gradients and foreshortening, in which scaling gradients are weighted more heavily. However, there is a serious problem with this argument. Knill (1998b) performed an ideal observer perturbation analysis to measure the impact of different sources of information on slant discrimination performance, and the results revealed that foreshortening is more heavily weighted than scaling by a large margin.
The fact that observers can discriminate differences in foreshortening does not necessarily indicate that foreshortening is a perceptually useful source of information about surface slant, or that the discrimination of texture patterns tells us anything at all about slant perception. If Saunders and Chen (2015) were to argue for this reason that Knill's (1998a, 1998b) discrimination results are most likely irrelevant to the issues considered in this commentary, I would immediately concede the point. However, I doubt they would ever propose such an argument, because it would cast doubt on all of the existing evidence that the reliability of slant estimates from texture is greater for large slants than for small slants. This is the cornerstone of their theoretical analysis, but it is based entirely on discrimination data, even though the standard interpretation of those data has been challenged by Todd et al. (2010).
Computing 3D shape from texture
The systematic variations of optical slant as a function of viewing direction pose a difficult problem for computational models of 3D shape from texture that are based on Equation 2 or 3. If texture information provided accurate measures of local physical slant, then these measures could be integrated to compute an accurate estimate of global 3D shape. However, if local optical slants are integrated, the resulting estimates of 3D shape will be systematically distorted. For example, integrating the optical slants on planar surfaces produces global shape estimates that are highly curved (e.g., see Todd & Thaler, 2010). For planar surfaces, it might be possible to compute local physical slant by subtracting the visual direction from local measures of optical slant, but this strategy will not work for curved surfaces or for images of planar surfaces viewed from an incorrect visual angle. How then do we explain why the image in Figure 3A appears planar, even when it is viewed with a visual angle that is only 1/10 the size of the camera angle with which it was rendered?
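The distortion produced by integrating optical slants can be demonstrated with a short sketch. Treating the optical slants of a 50° plane viewed over a 60° field as if they were physical slants, and integrating the resulting depth gradients, yields a profile whose slope grows from tan(20°) to tan(80°), i.e., a strongly curved surface. The naive integration scheme below is a schematic illustration, not a published model.

```python
import numpy as np

phi = np.radians(np.linspace(-30, 30, 601))      # visual directions (radians)
optical_slant = np.radians(50.0) + phi           # linear in viewing direction
gradient = np.tan(optical_slant)                 # local depth gradient
depth = np.cumsum(gradient) * (phi[1] - phi[0])  # naive integration

# A genuinely planar recovery would have a constant gradient; instead the
# gradient increases more than fifteenfold across the field of view:
print(round(gradient[0], 2), round(gradient[-1], 2))  # 0.36  5.67
```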
Saunders and Chen (2015) do not discuss how it is possible within their Bayesian framework to obtain judgments of overall 3D shape. The scaling contrast model, on the other hand, addresses this issue explicitly. Note in Equation 1 that this model estimates the relative depth at each point on a surface rather than local optical slant. Thus, it provides a direct estimate of 3D shape without requiring a process of integration. It is important to emphasize that these shape estimates are not veridical. This is demonstrated quite clearly in Figure 2, which shows the simulated shapes of hyperbolic cylinders computed from Equation 1 over a wide range of conditions. Although many of these estimated shapes are significantly distorted relative to the ground truth, they are nonetheless quite similar to the shape judgments obtained from human observers. 
Effects of sign of curvature
Another large effect in the perception of 3D shape from texture involves the appearance of concave and convex surfaces (Todd et al., 2005; Todd et al., 2007). Note in Figure 1, for example, that the two convex surfaces on the right appear to have larger asymptotic slants than the two concave surfaces on the left, even though the asymptotic slants of all four depicted surfaces are identical. This effect cannot be explained by the foreshortening or scaling gradient models, but the precise magnitude of the effect is again predicted quite accurately by scaling contrast (see Figure 2). The fact that Saunders and Chen (2015) do not mention this effect highlights the limitations of their analysis, and it is not at all clear how their model might be modified to account for it. One possibility might be to invoke a convexity prior, but this would entail still more free parameters, and it would be difficult to implement within an architecture that has no representation of surface curvature or global 3D shape. 
Effects of noise
For regular textures like the ones in Figures 1 and 3, the scaling contrast model can easily be implemented using direct measures of the optical texture elements in an image. For example, I have used this model to predict the apparent 3D shapes of op-art paintings such as the ones produced by Victor Vasarely, for which the ground truth is unknown. Implementing this model is more difficult, however, when there are random variations in the sizes and/or shapes of the texture elements. Todd et al. (2007) argued that random variation among physical texture elements makes it necessary to average over an appropriately large neighborhood of optical texture in order to obtain a stable measure of scaling contrast. This could potentially be achieved using a method developed by Malik and Rosenholtz (1997) that computes variations in texture scaling by comparing the amplitude spectra of local image patches (see also Thaler, Todd, & Dijkstra, 2007). The averaging of optical texture elements will inevitably decrease the measured value of S′Max and increase the measured value of S′Min. Thus, the effects of averaging for irregular textures would be expected to reduce the apparent depths of surfaces, which is consistent with the empirical findings of Todd et al. (2005) and Todd et al. (2007).
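The claimed effect of averaging on measured scaling contrast is straightforward to demonstrate in simulation. The sketch below adds random size variation to a texture scaling gradient and then averages over a local neighborhood; the lognormal noise model and the neighborhood size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Projected lengths that shrink along a scan line, with random variation
# in the physical sizes of the texture elements:
base = np.linspace(1.0, 0.5, 200)
noisy = base * rng.lognormal(mean=0.0, sigma=0.2, size=base.size)

def scaling_contrast(s):
    return (s.max() - s.min()) / (s.max() + s.min())

# Averaging over a neighborhood of 25 elements:
smoothed = np.convolve(noisy, np.ones(25) / 25, mode='valid')

print(round(scaling_contrast(noisy), 2), round(scaling_contrast(smoothed), 2))
# Averaging pulls S'_Max down and S'_Min up, so the measured scaling
# contrast (and hence the predicted depth) is reduced.
```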
Saunders and Chen (2015) express skepticism that the effects of averaging will be large enough to account for the differences in apparent depth between regular and irregular textures. They argue that decreased texture regularity lowers the reliability of texture information, so that its weight is reduced relative to conflicting frontal cues or a frontal prior. Although this is a plausible hypothesis, I suspect it will be difficult to implement as an actual working model. They could do so in a post hoc manner by measuring the perceptual gain for each possible texture pattern, and then adjusting the variance of the likelihood distribution by hand. This would result in a different free parameter for every possible texture one might choose to examine, which cannot be considered as a satisfactory explanation. A more plausible implementation of their approach would require some unspecified mechanism for measuring texture regularity, so that the spread of the likelihood functions can be modulated automatically based on available information. However, Saunders and Chen (2015) offer no suggestions about how that might be accomplished. 
Effects of orthographic projection
Another theoretically interesting issue in the analysis of 3D shape from texture involves images that are rendered using orthographic projection. This type of projection does not affect texture foreshortening, but it eliminates all variations in texture scaling. Thus, if observers' judgments were based on foreshortening, then the use of orthographic projection should have a negligible effect on perceptual performance. If, on the other hand, their judgments were based on scaling contrast or scaling gradients, then surfaces viewed under orthographic projection should all appear fronto-parallel. In order to evaluate these predictions, it is useful to consider the image of a planar surface under orthographic projection in Figure 3D. The depicted slant in this image is 50°, which could easily be determined from an analysis of texture foreshortening as defined by Equation 2. However, observers all report that it appears to have a fronto-parallel orientation (see Figure 4). This finding provides strong evidence that foreshortening per se is not a perceptually relevant source of information for the estimation of local surface slant.
The scaling contrast model is able to predict almost all of the phenomena discussed thus far with precise quantitative accuracy. However, there is one other effect in the perception of shape from texture that cannot be explained by this model even in principle. Figure 6A shows the image of an ellipsoid surface with an isotropic texture under orthographic projection, and Figure 6B shows the same surface with an anisotropic texture. The appearance of depth and curvature in these images cannot be explained by scaling contrast or scaling gradients because there are no systematic variations of texture scaling. The foreshortening model could potentially explain the appearance of Figure 6A, but not the appearance of Figure 6B because it has an anisotropic texture. 
Figure 6. Example images of an ellipsoid surface under orthographic projection with an isotropic texture (A), and an anisotropic texture (B).
Todd and Thaler (2010) have recently proposed a generalization of the foreshortening model that is applicable to surfaces with anisotropic textures. If a surface is depicted with negligible perspective and its texture is homogeneous, then the local physical slant (σ) at a particular location along a given scan line of the surface can be computed using the following equation:

\[\sigma = \cos^{-1}\!\left(\frac{w}{w_{Max}}\right)\tag{4}\]

where w is the width of an optical texture element at that location in the direction of the scan line, and wMax is the maximum width in that direction among all the elements on the scan line. This model provides an excellent account of observers' perceptions of shape from texture for surfaces rendered under orthographic projection or with small camera angles. However, when applied to images with significant amounts of perspective, it produces systematic errors that are incompatible with observers' shape judgments. For example, an analysis of directional width gradients predicts that the planar surface depicted in Figure 3A should appear highly curved. Because of its inability to cope with perspective, Todd and Thaler (2010) argued that this model is only viable if used in conjunction with scaling contrast. According to this account, observers' judgments are based on scaling contrast whenever there are measurable systematic variations of scaling within the pattern of optical stimulation, and directional width gradients are used as a fallback whenever these variations are absent.
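A minimal sketch of Equation 4 along a single scan line, with hypothetical width measurements:

```python
import numpy as np

def slants_from_width_gradient(widths):
    """Equation 4: local slant at each position on a scan line from the
    directional widths of the texture elements on that line."""
    w = np.asarray(widths, dtype=float)
    return np.degrees(np.arccos(w / w.max()))

# Hypothetical widths along one scan line of an image like Figure 6:
print(slants_from_width_gradient([0.50, 0.87, 1.00, 0.87, 0.50]))
# ~ [60, 29.5, 0, 29.5, 60]: the slant profile of a convex bump
```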
There have been numerous published articles that have highlighted the theoretical significance of orthographic projections for evaluating models of the perception of 3D shape from texture, but the discussion of Saunders and Chen (2015) does not consider this issue. It is not at all clear how the apparent shapes of the surfaces depicted in Figures 3 and 6 could be predicted by their model, and this must be considered a serious weakness, given that the analyses of Todd et al. (2007) and Todd and Thaler (2010) provide precise quantitative predictions of all of these effects.
An evaluation of competing models
By the standard criteria for evaluating models, the one proposed by Saunders and Chen (2015) does not fare well. In its most basic form, it has more free parameters than the scaling contrast model, yet it cannot explain any of the phenomena discussed above. They do propose some speculative explanations for the effects of noise or field of view, but their assertions about the latter are contradicted by the available empirical evidence. Perhaps the most salient weakness of their analysis is the absence of any numerical simulations to provide quantitative predictions about perceptual performance. One possible reason for this is that there are several other processes that would be necessary to actually implement this model that are not considered in their discussion. These include some mechanism for measuring veridical slant from texture, a mechanism for measuring texture regularity, and a mechanism for computing global 3D shape from local estimates of optical slant. 
For purposes of comparison, the scaling contrast model in conjunction with directional width gradients can account for almost all of the effects described above in a quantitatively precise manner with only two free parameters, and one of those parameters is only necessary with large fields of view. Some additional development is needed to explain the effects of texture regularity, but a promising blueprint for that has been provided by Malik and Rosenholtz (1997). Another important aspect of this model that deserves to be highlighted is that it is based on a contrast normalization mechanism that is found in many other sensory systems for processing information that can vary over a high dynamic range. Thus, it should not be surprising that this same mechanism was co-opted for the perception of 3D shape from texture. 
It may be possible to construct a model from a Bayesian perspective that can compete with scaling contrast by providing precise quantitative fits for all of the phenomena discussed above with a small number of free parameters. However, the model proposed by Saunders and Chen (2015) falls far short of that standard.
Acknowledgments
This research was supported by a grant from the National Science Foundation (BCS-0962119). 
Commercial relationships: none. 
Corresponding author: James T. Todd. 
Email: todd.44@osu.edu. 
Address: Department of Psychology, Ohio State University, Columbus, Ohio. 
References
Gårding, J. (1992). Shape from texture for smooth curved surfaces in perspective projection. Journal of Mathematical Imaging and Vision, 2, 327–350.
Gibson, J. J. (1950). The perception of the visual world. Boston, MA: Houghton Mifflin.
Knill, D. C. (1998a). Discriminating surface slant from texture: Comparing human and ideal observers. Vision Research, 38, 1683–1711.
Knill, D. C. (1998b). Ideal observer perturbation analysis reveals human strategies for inferring surface orientation from texture. Vision Research, 38, 2635–2656.
Malik, J., & Rosenholtz, R. (1997). Computing local surface orientation and shape from texture for curved surfaces. International Journal of Computer Vision, 23, 149–168.
Purdy, W. P. (1958). The hypothesis of psychophysical correspondence in space perception. Dissertation Abstracts International, 42, 1454. (University Microfilms No. 58-5594).
Saunders, J. A., & Backus, B. T. (2006). The accuracy and reliability of perceived depth from linear perspective as a function of image size. Journal of Vision, 6(9):7, 933–954, doi:10.1167/6.9.7.
Saunders, J. A., & Chen, Z. (2015). Perceptual biases and cue weighting in perception of 3D slant from texture and stereo information. Journal of Vision, 15(2):14, 1–24, doi:10.1167/15.2.14.
Thaler, L., Todd, J. T., & Dijkstra, T. M. H. (2007). The effects of phase on the perception of 3D shape from texture: Psychophysics and modeling. Vision Research, 47, 411–427.
Todd, J. T., Christensen, J. C., & Guckes, K. M. (2010). Are discrimination thresholds a valid measure of variance for judgments of slant from texture? Journal of Vision, 10(2):20, 1–18, doi:10.1167/10.2.20.
Todd, J. T., & Thaler, L. (2010). The perception of 3D shape from texture based on directional width gradients. Journal of Vision, 10(5):17, 1–13, doi:10.1167/10.5.17.
Todd, J. T., Thaler, L., & Dijkstra, T. M. H. (2005). The effects of field of view on the perception of 3D slant from texture. Vision Research, 45, 1501–1517.
Todd, J. T., Thaler, L., Dijkstra, T. M. H., Koenderink, J. J., & Kappers, A. M. L. (2007). The effects of viewing angle, camera angle and sign of surface curvature on the perception of 3D shape from texture. Journal of Vision, 7(12):9, 1–16, doi:10.1167/7.12.9.
Watt, S. J., Akeley, K., Ernst, M. O., & Banks, M. S. (2005). Focus cues affect perceived depth. Journal of Vision, 5(10):7, 834–862, doi:10.1167/5.10.7.