January 2012
Volume 12, Issue 1
Free
Article  |   January 2012
Relative flattening between velvet and matte 3D shapes: Evidence for similar shape-from-shading computations
Author Affiliations
Journal of Vision January 2012, Vol.12, 2. doi:10.1167/12.1.2
  • Views
  • PDF
  • Share
  • Tools
    • Alerts
      ×
      This feature is available to authenticated users only.
      Sign In or Create an Account ×
    • Get Citation

      Maarten W. A. Wijntjes, Katja Doerschner, Gizem Kucukoglu, Sylvia C. Pont; Relative flattening between velvet and matte 3D shapes: Evidence for similar shape-from-shading computations. Journal of Vision 2012;12(1):2. doi: 10.1167/12.1.2.

      Download citation file:


      © ARVO (1962-2015); The Authors (2016-present)

      ×
  • Supplements
Abstract

Among other cues, the visual system uses shading to infer the 3D shape of objects. The shading pattern depends on the illumination and reflectance properties (BRDF). In this study, we compared 3D shape perception between identical shapes with different BRDFs. The stimuli were photographed 3D printed random smooth shapes that were either painted matte gray or had a gray velvet layer. We used the gauge figure task (J. J. Koenderink, A. J. van Doorn, & A. M. L. Kappers, 1992) to quantify 3D shape perception. We found that the shape of velvet objects was systematically perceived to be flatter than the matte objects. Furthermore, observers' judgments were more similar for matte shapes than for velvet shapes. Lastly, we compared subjective with veridical reliefs and found large systematic differences: Both matte and velvet shapes were perceived more flat than the actual shape. The isophote pattern of a flattened Lambertian shape resembles the isophote pattern of an unflattened velvet shape. We argue that the visual system uses a similar shape-from-shading computation for matte and velvet objects that partly discounts material properties.

Introduction
The visual world comprises of objects (“things”) and material (“stuff”; Adelson, 2001). We visually perceive these indirectly, by the way light is reflected from objects. The illumination, object shape, and the reflectance contribute to the shading pattern that reaches the retina. Painters seem to (implicitly) understand this process: They can convey the shape, illumination, and reflectance properties of a scene by mere shading, such as the painting of Alfred Stevens in Figure 1. Our visual system appears to master the reverse process of reconstructing a 3D scene from a 2D projection, including attributes such as reflectance. This task is formally impossible to solve (it is mathematically underconstrained), and it is, therefore, no surprise that our visual system cannot perceive the world veridically. The retinal signal is inherently ambiguous, yet humans are hardly aware of this because the ambiguity itself is not present in the percept. Although infinitely many worlds could have led to a certain retinal image, our perception is unitary (Koenderink, 2001). This leads to the illusion that perception is (up to some noise) veridical. However, veridical perception is formally impossible and, furthermore, evolutionary implausible (Mark, Marion, & Hoffman, 2010). 
Figure 1
 
The Blue Dress (oil on panel) by Alfred Emile Stevens (1823–1906). Sterling and Francine Clark Art Institute, Williamstown, USA. The Bridgeman Art Library.
Figure 1
 
The Blue Dress (oil on panel) by Alfred Emile Stevens (1823–1906). Sterling and Francine Clark Art Institute, Williamstown, USA. The Bridgeman Art Library.
To a certain extent, ambiguities can be resolved. To do so, the visual system makes use of prior knowledge, e.g., the prior that light comes from above and slightly to the left (Sun & Perona, 1998) or the convex shape prior (Langer & Bülthoff, 2001). The mere existence of these priors indicates that perception is not veridical, since prior information is based on averages. 
Interactions between shape, illumination, and reflectance
Understanding visual perception implies understanding how ambiguities are resolved. This can be studied by investigating the interaction of illumination, shape, and reflectance perception. Both reflectance (material) and shape perception are influenced by illumination. For example, rendering a glossy object in a diffuse light field 1 results in a perceptually matte object (Dror, Willsky, & Adelson, 2004; Pont & Te Pas, 2006). This effect is likely due to the absence of highlights for objects illuminated by a diffuse light field. The geometry of the light field dictates the distribution of highlights on a glossy surface. Indeed, it has recently been shown that for image-based illumination (Debevec, 1998) the geometry of the light field determines the level of perceived gloss (Doerschner, Boyaci, & Maloney, 2010; Olkkonen & Brainard, 2010), although the exact relation between light field geometry and gloss is still unknown. The direction of the illumination also affects material perception. It has been shown that surface roughness perception depends on the illumination direction (Ho, Landy, & Maloney, 2006): Increasing the angle between the illumination direction and surface normal results in an increased perceived roughness. Interestingly, this illusory roughness increase did not depend on the presence of contextual cues relating to the illumination direction. This does not mean that in general the interaction between light and material is one way. In fact, it has been shown that reflectance can also influence the perception of illumination direction (Khang, Koenderink, & Kappers, 2006; Pont & Koenderink, 2007). 
The characteristics of illumination also affect the perception of 3D shape, for example, the dominant illumination direction changes shape inference substantially (Caniard & Fleming, 2007; Christou & Koenderink, 1997; Todd, Koenderink, van Doorn, & Kappers, 1996). Koenderink, van Doorn, Kappers, and Todd (2001) proposed that ambiguities in 3D shape perception are generally linear: They can be accounted for by an affine transformation. This hypothesis is based on the assumption that observers can reliably differentiate between planar and curved surfaces. Indeed, Koenderink et al. (2001) found that differences between observers could be described by the affine transformation. However, in this study, the illumination conditions were not varied. Nefs, Koenderink, and Kappers (2005) showed that differences arising from different light fields have a substantial nonlinear component. Thus, the affine transformation does not seem to capture all subjective differences. 
Lastly, it has been shown that the perception of 3D shape and reflectance interact. Norman, Todd, and Orban (2004) showed that perception of 3D shape discrimination is improved when specular highlights are present. Using a different paradigm, Nefs, Koenderink, and Kappers (2006) found no difference in shape perception between matte and glossy objects. Besides an early investigation on the perception of curvature for matte and glossy surfaces (Todd & Mingolla, 1983), there are no relevant studies concerning the interaction between shape and material except Khang, Koenderink, and Kappers (2007). They studied perception of ellipsoid shapes rendered with four different bidirectional reflection distribution functions (BRDFs): Lambertian, specular, asperity (velvet), and backscattering. It was found that shapes were perceived differently for these four BRDFs. In particular, they found that velvety shapes were perceived less precise than specular shapes. In addition, it was found that velvet and backscattering shapes appear more flat than Lambertian or specular shapes. For backscattering, this effect can intuitively be verified by looking at the moon, which is a backscatterer. A full moon appears as a flat disk rather than as a sphere due to the optical effect that all light is scattered back to the observer without producing a gradient in the image. 
Previous studies have mainly used virtual, rendered objects as stimuli. This has a number of advantages since the object geometry can be used to compare with perceptual judgments (although this is often omitted), and more importantly, the BRDF can be easily implemented. In the study presented here, we have used real shapes, which were produced using a 3D printer. We wanted to understand the influence of BRDF on shape perception using more ecologically valid stimuli, although the actual stimuli were photographs of the real objects and thus still somewhat “virtual.” We choose to compare Lambertian scattering with asperity scattering. The actual BRDFs of our stimuli only qualitatively resemble ideal BRDFs. We used matte paint that is not a perfectly diffuse scatterer and a flock layer that is not a perfect homogeneous layer of asperities. Nevertheless, because of these imperfections, they are more realistic. We conducted a gauge figure task experiment for two different, random shapes. Both shapes had matte and velvet versions, which means we used a total of four stimuli. 
Methods
Observers
Eight observers (5 males and 3 females, mean age of 37 years) participated in the experiment, including one of the authors (SP). The observers were colleagues from the same laboratory as the main author. Participation was voluntary. Three observers had participated in gauge figure experiments previously (SP, HN, and HR), while for the other five observers, this was the first time. All but SP were naive with respect to the purpose of the study and had not seen the stimuli before, either in reality or on pictures. All had normal or corrected-to-normal vision. 
Stimuli
The stimuli were generated in 3D Studio MAX by modulating 3D spheres with low-frequency sinusoids. The resulting objects were then printed using an EDEN250TM 16 micron layer 3-Dimensional Printing System of Objet, Belgium (see Figure 2). One print of each shape was painted with gray (Ral 7035) matte spray paint. The other print of each shape was treated with gray (color matched to Ral 7035) flock to induce a velvety layer. 
Figure 2
 
Stimuli for Experiment 1. Two different random smooth shapes that were either painted matte or flocked velvety.
Figure 2
 
Stimuli for Experiment 1. Two different random smooth shapes that were either painted matte or flocked velvety.
All four stimuli were photographed with a Canon 400D. Lighting was provided by a theater spotlight that was directed toward the white ceiling of the photo studio, creating a diffuse light source from above. The focal point of the lens was set at 200 mm, the aperture value at F22, ISO 400, and exposure at 3.2 s. A white probe was used to correct the white balance. The picture presentation on a LaCie CRT screen during the experiment was calibrated to be linear. The image itself was 783 pixels wide, and the width of the stimulus was approximately 520 pixels. The screen was 40 cm wide and 30 cm high, and the resolution was 1600 by 1200 pixels. The head of the observer rested on a chin rest, with one eye patched and the other 57 cm away from the screen. The stimulus thus subtended a visual angle of approximately 13 degrees. 
Procedure
We used the “gauge figure task” (Koenderink, van Doorn, & Kappers, 1992; Wijntjes, 2011) to quantify the perceived 3D surface structure. In this task, observers are instructed to manipulate an attitude probe such that it appears to lie flat on the pictorial surface. The attitude probe consists of a circle with a rod protruding from the middle. A typical setting is shown in Figure 3. The lines of the gauge figure were 2 pixels wide and colored red. The radius of the circle and the length of the perpendicular rod were both 15 pixels. The observer could manipulate the 3D orientation (slant and tilt) with the mouse. The slant is the angle with the image plane normal and the tilt is the rotation in the image plane. 
Figure 3
 
Illustration of the method. The task was to adjust the gauge figure probe such that it lies flat on the pictorial surfaces. This was performed in random order over all sampling points. From the data, a 3D surface was reconstructed.
Figure 3
 
Illustration of the method. The task was to adjust the gauge figure probe such that it lies flat on the pictorial surfaces. This was performed in random order over all sampling points. From the data, a 3D surface was reconstructed.
The gauge figure was shown at the barycentra of a triangular grid that covered the stimulus as shown in Figure 3. The triangulation of shape 1 had 246 barycentra, while shape 2 had 241 barycentra. Both triangulations consisted of 145 vertices. The gauge figure was presented one at a time, and the order of barycentra was randomized. The stimulus order was counterbalanced for the group of observers. It was ensured that an observer did not see the same shape (of different BRDF) subsequently. 
Data analysis
Each gauge figure setting represents a local surface attitude, which can be interpreted as a depth gradient. The depth gradients can be integrated to a surface, which comprises the (x,y) values of the triangulation vertices and perceived depth values z. The depth values were compared to analyze perceptual differences between matte and velvet shapes. First, we quantified possible depth compressions for the velvet shapes by performing a linear regression between matte and velvet conditions. If the slope (a) of the regression z velvet = az matte + b is smaller than 1, depth is compressed; if a > 1, depth is extended. This was analyzed within observers (between BRDF). Second, the interobserver similarity was quantified by calculating the adjusted R 2 (coefficient of determination) of the regression between the depth values of each stimulus. The higher the adjusted R 2, the higher the similarity in perceived shape and the lower the level of ambiguity. Besides this “straight” regression that reveals depth compression (a) and similarity (R 2), we also performed affine regressions. It has been proposed by Koenderink et al. (2001) that when one assumes that “planes can reliably be differentiated from curved surfaces, owing to cues such as shading and so forth,” image ambiguities are described by the affine transformation z(x,y) = az + b + cx + dy. It is thus similar to the straight regression with the addition of a plane cx + dy. The affine regression basically captures all linear differences between perceived depths. If the adjusted R 2 of the affine regression (taking into account the extra two parameters c and d) is significantly larger than the straight regression adjusted R 2, the difference between depths is attributed to the affine plane, since the compression/stretch parameter a is already present in the straight regression. 
Since we used 3D prints of random stimuli of which we knew the geometry, we also wanted to compare our perceptual depths with the ground-truth depths. However, the 3D stimuli did not have any markers on them to gauge the orientations of the shapes. Therefore, we had to “manually” align the photographed shape with the 3D model. To do so, we wrote software with which we could interactively rotate the vertices until they appeared to be juxtaposed with the photographs. The result can be seen in Figure 4. The vertices were represented as dots, which makes the models translucent. The first author repeated the rotation task four times to assess the variance of this method. The average R 2 of all six pairs of the four repetitions was 0.995 and 0.997 for shapes 1 and 2, respectively. 
Figure 4
 
Screen captures of the matching method to manually align the 3D model (red dots) with the photos (background). On the right, a misaligned 3D model is shown for shape 2.
Figure 4
 
Screen captures of the matching method to manually align the 3D model (red dots) with the photos (background). On the right, a misaligned 3D model is shown for shape 2.
Results
Between- and within-observers analysis
The depth gain can be quantified by the slope of the straight regression. In Figure 5a, we plotted the average depth gains for the two stimuli. It can be clearly seen that for both shapes the velvet stimulus appears more flat than the matte stimulus. Both slopes are significantly different from 1 according to a paired t-test: t(7) = −3.74269, p < 0.05 and t(7) = −2.59789, p < 0.05 for the first and second stimuli, respectively. To assess whether there is some kind of depth compression invariance across different shapes, we correlated the depth gains of the first and second stimuli. We found a surprisingly high correlation of the flattening effect (r = 0.902, p = 0.002), which is shown in Figure 5b
Figure 5
 
(a) Average depth gain between velvet and Lambertian. N = 8; error bars denote 95% confidence interval of the mean. (b) Correlation between depth gains for the first and second shapes.
Figure 5
 
(a) Average depth gain between velvet and Lambertian. N = 8; error bars denote 95% confidence interval of the mean. (b) Correlation between depth gains for the first and second shapes.
To assess the level of perceptual ambiguity of each of the individual stimuli, we pairwise correlated the perceived depths of all observer combinations (for 8 observers, this amounts to 8(8 − 1) / 2 = 28 pairs), shown in Figure 6. A two-by-two ANOVA revealed that there was no significant difference between the shapes (F(1, 109) = 3.24058, p = 0.0746). However, the velvet shapes yielded a significantly higher perceptual ambiguity, i.e., a lower coefficient of determination (F(1, 109) = 8.70619, p = 0.0039). Velvet shapes are, thus, perceived more ambiguously between observers than matte shapes. 
Figure 6
 
Perceptual consistency quantified by the adjusted coefficient of determination R 2. N = 28 (number of subject pairs); error bars denote 95% confidence interval of the mean.
Figure 6
 
Perceptual consistency quantified by the adjusted coefficient of determination R 2. N = 28 (number of subject pairs); error bars denote 95% confidence interval of the mean.
To see in detail whether the affine transformation significantly accounts for differences between observers, we plotted the straight and affine coefficient of determination (adjusted R 2) for each observer pair in Figure 7. In this figure, the gray bars denote the straight coefficient of determination and the white bars denote the affine coefficient of determination. The alternating grays help discriminate observers. The asterisks denote significant affine improvements. The total fraction of affine improvements over all observer pairs is expressed as a percentage. For the first shape, a relatively large difference is found for the amount of significant affine improvement between the matte (75%) and velvet (43%) stimuli. The difference between these proportions was significantly different (χ 2(1, N = 54) = 5.976, p = 0.014). For the second stimulus, the difference is qualitatively similar but much smaller (61% against 54%) and not significantly different (χ 2(1, N = 54) = 0.292, p = 0.589). A significant affine improvement reflects that the differences between observers are linear. Thus, the differences between observers in perceiving the velvet shape seem to be of relatively nonlinear nature for the first shape, but this difference is undetermined for the second shape. 
Figure 7
 
Pairwise straight and affine correlations between observers, within stimulus. Filled bars denote straight coefficients of determination; open bars denote affine coefficients of determination. Alternating gray values denote different observer pairs as also indicated by the numbers at the bottom of the bars.
Figure 7
 
Pairwise straight and affine correlations between observers, within stimulus. Filled bars denote straight coefficients of determination; open bars denote affine coefficients of determination. Alternating gray values denote different observer pairs as also indicated by the numbers at the bottom of the bars.
Performing the same affine analysis between BRDFs, within observers, yields the result shown in Figure 8. Here a somewhat different picture arises. For stimulus 1, the differences between the Lambertian and velvet stimulus seem largely (75%) linear, while the differences for the second stimulus are largely nonlinear (12%). 
Figure 8
 
Straight and affine correlations within subjects, between material (matte vs. velvet).
Figure 8
 
Straight and affine correlations within subjects, between material (matte vs. velvet).
Veridicality analysis
First, we correlated the veridical depths with the subjective (reconstructed) depths, which can be seen in Figure 9. A 2-way repeated measures ANOVA with material and shape as factors showed that overall the velvet shapes were perceived less veridical than the matte shapes (F(1, 7) = 7.392, p < 0.05). Furthermore, the ANOVA showed no significant effect for the stimulus factor or the interaction. The magnitude of the coefficients of determination (adjusted R 2) is of comparable size with those found in the between-subjects correlation as indicated by the dashed bars in Figure 9, although for shape 1 the correlation with veridical appears somewhat lower. 
Figure 9
 
(Top) Straight regression with respect to the veridical shape. Dashed bars show the results of Figure 6 for comparison. (Bottom) The depth gains with respect to the veridical shape are shown. N = 8; error bars denote 95% confidence interval of the mean.
Figure 9
 
(Top) Straight regression with respect to the veridical shape. Dashed bars show the results of Figure 6 for comparison. (Bottom) The depth gains with respect to the veridical shape are shown. N = 8; error bars denote 95% confidence interval of the mean.
However, as can be seen on the right side of Figure 9, a large difference between the veridical and subjective reliefs is found in the depth gain. In Figure 10, we plotted two views of the veridical shape 1 and the average subjective reliefs for the matte and velvet stimuli. As can be clearly seen, the subjective reliefs are much flatter than the actual shape. This is confirmed by the depth gains that are plotted in Figure 9
Figure 10
 
Three-dimensional visualizations of the veridical shape and the reconstructed shapes of the matte and velvet stimuli.
Figure 10
 
Three-dimensional visualizations of the veridical shape and the reconstructed shapes of the matte and velvet stimuli.
Discussion
The results show that visual perception of 3D shape depends on the reflectance properties. Velvet shapes appear more flat than matte painted shapes. The differences between subjective reliefs are substantially nonlinear, which depends both on the particular shape and the reflectance properties. Furthermore, we found that in comparison with the ground-truth geometry, the perceived depth of all stimuli is more than twice as low as the actual shape. 
As already noted by Khang et al. (2007), velvet shapes seem to be flattened in comparison with matte shapes. Apparently, the shape-from-shading mechanism used by the visual system makes systematic errors when confronted with a velvet shading. To understand this flattening effect, we rendered a velvet sphere with an analytic “velvet” BRDF described by Koenderink and Pont (2003) and compared it with a rendered Lambertian spheroid. We used a collimated light source that illuminated the shapes with a slant of 45 degrees, from above. The Lambertian spheroids were parameterized by x 2 + y 2 + (λz)2 = 1, where z is perpendicular to the image plane and λ ranged from 0.25 (oblate spheroid) to 1 (sphere). In Figure 11, the top row represents the renderings, the middle row shows the isophote patterns, and the bottom row shows the luminance histograms. As can be seen, both the isophote pattern and the luminance histogram shape of the velvet sphere resembles the isophote pattern and histogram shape of the flattened Lambertian shapes more than that of the actual spherical Lambertian shape. Although there may be many differences between these simple, rendered shapes and our real, photographed stimuli, it is not unlikely that such shading pattern and histogram resemblances may be responsible for the flattening effect. Furthermore, we took photographs of the stimuli with collimated light and plotted the luminance histograms in Figure 12. As can be seen in that figure, there appears to be a systematic shift between the brightest modes of the matte and velvet shapes. However, in the photographs, we can see that these brightest modes are probably due to different optical mechanisms and the brightest pixels are located at different positions. For the matte stimuli, we see highlights, and for the velvet ones, we see bright contours. Thus, in order to explain shape perception, we need to take into account the spatial characteristics of the luminance distribution, e.g., the isophote pattern. The isophotes effectively represent the direction orthogonal to the shading gradients. It is, therefore, more likely that the isophote pattern affects shape inference. 
Figure 11
 
Renderings of a velvet sphere (left) compared with Lambertian spheroids of varying depth compression. The middle row represents the isophote patterns. The bottom row represents the luminance histograms.
Figure 11
 
Renderings of a velvet sphere (left) compared with Lambertian spheroids of varying depth compression. The middle row represents the isophote patterns. The bottom row represents the luminance histograms.
Figure 12
 
The original stimuli (top two rows) and the stimuli photographed with collimated illumination (bottom two rows). The last column shows the luminance histograms in which blue denotes the matte objects and purple denotes the velvet objects.
Figure 12
 
The original stimuli (top two rows) and the stimuli photographed with collimated illumination (bottom two rows). The last column shows the luminance histograms in which blue denotes the matte objects and purple denotes the velvet objects.
If the velvet shading pattern is misinterpreted as Lambertian shading, the shape-from-shading computation may be biased to a flattened shape. However, the reverse may also be true: Assuming velvet shading for a matte object would result in a matte shape stretched in depth. It appears that shape-from-shading is computed with similar assumptions across different types of material. Thus, if shading gradients change as a function of BRDF, the change may be attributed to a difference in 3D shape and not be attributed to the material type (discounted). In support of this, we found a significant perceived flattening of velvety objects. Attributing changes in shading to changes in shape may also explain the often found violations of shape constancy for changing illumination (Caniard & Fleming, 2007; Christou & Koenderink, 1997; Nefs et al., 2005; Todd et al., 1996). The assumptions that underlie shape-from-shading likely come from experience. If that is true, it may be possible to model shape inference by assuming a canonical BRDF and illumination. Future understanding of the assumptions underlying human shape-from-shading computations may depend on the use of virtual, rendered stimuli in which parametric control on illumination, shape, and BRDF is possible. 
Furthermore, we found that differences between observers and between shapes are partly nonlinear. Linearity is quantified by whether the affine transformation accounts significantly for shape differences. It has previously been found that changing the illumination affects shape perception in a substantial nonlinear fashion (Nefs et al., 2005). In this study, we have shown that different BRDFs also lead to nonlinear shape perception differences, although the results are not unequivocal. We analyzed linearity of shape differences in two ways. First, we analyzed differences between observers, within stimuli (Figure 7). This comparison reveals whether interobserver differences are linear. We found that for shape 1 there is a substantial difference in linearity between the matte (75%) and velvet stimulus (43%) but not so for shape 2 (61% and 54% for the matte and velvet stimulus, respectively). Second, we compared differences within observers, between material type (Figure 8). This comparison reveals whether the perceptual differences between matte and velvet objects are linear. Here, we found that for shape 1, the differences are predominantly linear (75%), while for shape 2 the differences are predominantly nonlinear (12.5%). This may be due to geometrical differences between the two shapes. Shape 1 is a rather smooth, globular shape, while shape 2 is characterized by higher frequencies. It should be noted that affine improvement only depends on the “additive plane” (cx + dy) and not on the depth gain that is already present in the straight regression. Thus, while depth compression is present in shape 2, an additive plane does not improve the similarities. This possibly indicates that nonlinear differences in perception between objects with different BRDFs depend on the presence of high-frequency shape variations. 
Because we used stimuli with known geometry, we were able to compare the subjective reliefs with the actual reliefs. Comparison with veridicality has often been omitted in previous studies. The main reason behind this is that a picture is inherently ambiguous and it thus makes little sense to compare the perceived shape with the actual shape. This reasoning holds when the differences (if analyzed) are unsystematic. However, we found large systematic differences between the perceived and actual shapes. Observers perceived the shapes more than twice as flat as the actual shape. The relative flattening effect of velvet shapes with respect to matte shapes is a smaller effect than the overall flattening effect of perceived shapes. Large differences in depth scaling have previously been reported. Koenderink, van Doorn, and Kappers (1994) compared pictorial relief for three different viewing conditions: monocular, binocular, and synoptical. The stimuli were “flat” pictures, and no stereo information was present. They found that in comparison with monocular viewing, observers perceived the binocularly viewed relief flattened by factors between 1.37 and 2.12. Thus, binocular information of a “flat” image plane also flattens the pictorial shape. It should be noted that these were comparisons between viewing conditions and not with the veridical shape. In our experiment, observers viewed monocularly, with their heads fixed in a chin rest. Yet, it could be that the flatness of the monitor (either through perceptual cues from small head movements and accommodation cues or the observer awareness about the monitors' flatness) biased pictorial depth perception. Another reason of the overall flattening could be the absence of disparity information. It has been shown that the binocular flattening effect found by Koenderink et al. (1994) reverses when disparity information is present (Koenderink, van Doorn, & Kappers, 1995). In the latter study, the depth gain of binocular viewing with respect to monocular viewing of a real 3D object was found to be between 1 and 1.5. It is likely that our findings are the result of these two factors: the flat monitor and the absence of true binocular disparity. 
Although all stimuli appear more flat than the veridical shape, the veridicality analysis revealed that observers perceived matte shapes more veridical than the velvet ones. This could be interpreted by a better tuning of the visual system to matte objects as opposed to velvet ones. An ecological explanation of this effect could be a higher familiarity with matte objects. As we discussed earlier, the visual system seems to interpret changes in shading between the matte and velvet stimuli as changes in shape. In support of the tuning of the visual system to matte objects, we found that the ambiguity for matte shapes was substantially smaller than for velvet ones. The assumption of a canonical BRDF underlying shape-from-shading computations may explain this. 
Acknowledgments
This work was supported by a grant from the Netherlands Organization of Scientific Research (NWO). K. D. was supported by an FP7 Marie Curie IRG 239494. The authors would like to thank Nina Gaissert and Roland Fleming from the Max Planck Institute for Biological Cybernetics for their invaluable help with producing the 3D prints. 
Commercial relationships: none. 
Corresponding author: Maarten Wijntjes. 
Email: m.w.a.wijntjes@tudelft.nl. 
Address: Faculty of Industrial Design Engineering, Delft University of Technology, Landbergstraat 15, Delft 2628 CE, Netherlands. 
Footnote
Footnotes
1  In this paper, the “light field” denotes the local light field located at the position of the stimulus.
References
Adelson E. H. (2001). On seeing stuff: The perception of materials by humans and machines. Proceedings of SPIE—The International Society for Optical Engineering, 4299, 1–12.
Caniard F. Fleming R. W. (2007). Distortion in 3D shape estimation with changes in illumination. In Proceedings of the 4th Symposium on Applied Perception in Graphics and Visualization (APGV'07) (pp. 99–105). New York, NY, USA: ACM.
Christou C. G. Koenderink J. J. (1997). Light source dependence in shape-from-shading. Vision Research, 37, 1441–1449. [CrossRef] [PubMed]
Debevec P. (1998). Rendering synthetic objects into real scenes: Bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques (SIGGRAPH '98) (pp. 189–198). New York, NY, USA: ACM.
Doerschner K. Boyaci H. Maloney L. T. (2010). Estimating the glossiness transfer function induced by illumination change and testing its transitivity. Journal of Vision, 10(4):8, 1–9, http://www.journalofvision.org/content/10/4/8, doi:10.1167/10.4.8. [PubMed] [Article] [CrossRef] [PubMed]
Dror R. O. Willsky A. S. Adelson E. H. (2004). Statistical characterization of real-world illumination. Journal of Vision, 4(9):11, 821–837, http://www.journalofvision.org/content/4/9/11, doi:10.1167/4.9.11. [PubMed] [Article] [CrossRef]
Ho Y.-H. Landy M. S. Maloney L. T. (2006). How direction of illumination affects visually perceived surface roughness. Journal of Vision, 6(5):8, 634–648, http://www.journalofvision.org/content/6/5/8, doi:10.1167/6.5.8. [PubMed] [Article] [CrossRef] [PubMed]
Khang B. G. Koenderink J. J. Kappers A. M. L. (2006). Perception of illumination direction in images of 3-d convex objects: Influence of surface materials and light fields. Perception, 35, 625–645. [CrossRef] [PubMed]
Khang B. G. Koenderink J. J. Kappers A. M. L. (2007). Shape-from-shading from images rendered with various surface types and light fields. Perception, 36, 1191–1213. [CrossRef] [PubMed]
Koenderink J. J. (2001). Multiple visual worlds. Perception, 30, 1–7. [CrossRef] [PubMed]
Koenderink J. J. Pont S. C. (2003). The secret of velvety skin. Machine Vision and Applications, 14, 260–268. [CrossRef]
Koenderink J. J. van Doorn A. J. Kappers A. M. L. (1992). Surface perception in pictures. Perception & Psychophysics, 52, 487–496. [CrossRef] [PubMed]
Koenderink J. J. van Doorn A. J. Kappers A. M. L. (1994). On so-called paradoxical monocular stereoscopy. Perception, 23, 583–594. [CrossRef] [PubMed]
Koenderink J. J. van Doorn A. J. Kappers A. M. L. (1995). Depth relief. Perception, 24, 115–126. [CrossRef] [PubMed]
Koenderink J. J. van Doorn A. J. Kappers A. M. L. Todd J. T. (2001). Ambiguity and the ‘mental eye’ in pictorial relief. Perception, 30, 431–448. [CrossRef] [PubMed]
Langer M. S. Bülthoff H. H. (2001). A prior for global convexity in local shape-from-shading. Perception, 30, 403–410. [CrossRef] [PubMed]
Mark J. T. Marion B. B. Hoffman D. D. (2010). Natural selection and veridical perceptions. Journal of Theoretical Biology, 266, 504–515. [CrossRef] [PubMed]
Nefs H. T. Koenderink J. J. Kappers A. M. L. (2005). The influence of illumination direction on the pictorial reliefs of Lambertian surfaces. Perception, 34, 275–287. [CrossRef] [PubMed]
Nefs H. T. Koenderink J. J. Kappers A. M. L. (2006). Shape-from-shading for matte and glossy objects. Acta Psychologica, 121, 297–316. [CrossRef] [PubMed]
Norman J. F. Todd J. T. Orban G. A. (2004). Perception of three-dimensional shape from specular highlights, deformations of shading, and other types of visual information. Psychological Science, 15, 565–570. [CrossRef] [PubMed]
Olkkonen M. Brainard D. (2010). Perceived glossiness and lightness under real-world illumination. Journal of Vision, 10(9):5, 1–19, http://www.journalofvision.org/content/10/9/5, doi:10.1167/10.9.5. [PubMed] [Article] [CrossRef] [PubMed]
Pont S. C. Koenderink J. J. (2007). Matching illumination of solid objects. Perception & Psychophysics, 69, 459–468. [CrossRef] [PubMed]
Pont S. C. Te Pas S. F. (2006). Material–illumination ambiguities and the perception of solid objects. Perception, 35, 1331–1350. [CrossRef] [PubMed]
Sun J. Perona P. (1998). Where is the sun? Nature Neuroscience, 1, 183–184. [CrossRef] [PubMed]
Todd J. T. Koenderink J. J. van Doorn A. J. Kappers A. M. L. (1996). Effects of changing viewing conditions on the perceived structure of smoothly curved surfaces. Journal of Experimental Psychology: Human Perception and Performance, 22, 695–706. [CrossRef] [PubMed]
Todd J. T. Mingolla E. (1983). Perception of surface curvature and direction of illumination from patterns of shading. Journal of Experimental Psychology: Human Perception and Performance, 9, 583–595. [CrossRef] [PubMed]
Wijntjes M. W. A. (2011). Probing pictorial relief: From experimental design to surface reconstruction. Behavior Research Methods, 1–9.
Figure 1
 
The Blue Dress (oil on panel) by Alfred Emile Stevens (1823–1906). Sterling and Francine Clark Art Institute, Williamstown, USA. The Bridgeman Art Library.
Figure 1
 
The Blue Dress (oil on panel) by Alfred Emile Stevens (1823–1906). Sterling and Francine Clark Art Institute, Williamstown, USA. The Bridgeman Art Library.
Figure 2
 
Stimuli for Experiment 1. Two different random smooth shapes that were either painted matte or flocked velvety.
Figure 2
 
Stimuli for Experiment 1. Two different random smooth shapes that were either painted matte or flocked velvety.
Figure 3
 
Illustration of the method. The task was to adjust the gauge figure probe such that it lies flat on the pictorial surfaces. This was performed in random order over all sampling points. From the data, a 3D surface was reconstructed.
Figure 3
 
Illustration of the method. The task was to adjust the gauge figure probe such that it lies flat on the pictorial surfaces. This was performed in random order over all sampling points. From the data, a 3D surface was reconstructed.
Figure 4
 
Screen captures of the matching method to manually align the 3D model (red dots) with the photos (background). On the right, a misaligned 3D model is shown for shape 2.
Figure 4
 
Screen captures of the matching method to manually align the 3D model (red dots) with the photos (background). On the right, a misaligned 3D model is shown for shape 2.
Figure 5
 
(a) Average depth gain between velvet and Lambertian. N = 8; error bars denote 95% confidence interval of the mean. (b) Correlation between depth gains for the first and second shapes.
Figure 5
 
(a) Average depth gain between velvet and Lambertian. N = 8; error bars denote 95% confidence interval of the mean. (b) Correlation between depth gains for the first and second shapes.
Figure 6
 
Perceptual consistency quantified by the adjusted coefficient of determination R 2. N = 28 (number of subject pairs); error bars denote 95% confidence interval of the mean.
Figure 6
 
Perceptual consistency quantified by the adjusted coefficient of determination R 2. N = 28 (number of subject pairs); error bars denote 95% confidence interval of the mean.
Figure 7
 
Pairwise straight and affine correlations between observers, within stimulus. Filled bars denote straight coefficients of determination; open bars denote affine coefficients of determination. Alternating gray values denote different observer pairs as also indicated by the numbers at the bottom of the bars.
Figure 7
 
Pairwise straight and affine correlations between observers, within stimulus. Filled bars denote straight coefficients of determination; open bars denote affine coefficients of determination. Alternating gray values denote different observer pairs as also indicated by the numbers at the bottom of the bars.
Figure 8
 
Straight and affine correlations within subjects, between material (matte vs. velvet).
Figure 8
 
Straight and affine correlations within subjects, between material (matte vs. velvet).
Figure 9
 
(Top) Straight regression with respect to the veridical shape. Dashed bars show the results of Figure 6 for comparison. (Bottom) The depth gains with respect to the veridical shape are shown. N = 8; error bars denote 95% confidence interval of the mean.
Figure 9
 
(Top) Straight regression with respect to the veridical shape. Dashed bars show the results of Figure 6 for comparison. (Bottom) The depth gains with respect to the veridical shape are shown. N = 8; error bars denote 95% confidence interval of the mean.
Figure 10
 
Three-dimensional visualizations of the veridical shape and the reconstructed shapes of the matte and velvet stimuli.
Figure 10
 
Three-dimensional visualizations of the veridical shape and the reconstructed shapes of the matte and velvet stimuli.
Figure 11
 
Renderings of a velvet sphere (left) compared with Lambertian spheroids of varying depth compression. The middle row represents the isophote patterns. The bottom row represents the luminance histograms.
Figure 11
 
Renderings of a velvet sphere (left) compared with Lambertian spheroids of varying depth compression. The middle row represents the isophote patterns. The bottom row represents the luminance histograms.
Figure 12
 
The original stimuli (top two rows) and the stimuli photographed with collimated illumination (bottom two rows). The last column shows the luminance histograms in which blue denotes the matte objects and purple denotes the velvet objects.
Figure 12
 
The original stimuli (top two rows) and the stimuli photographed with collimated illumination (bottom two rows). The last column shows the luminance histograms in which blue denotes the matte objects and purple denotes the velvet objects.
×
×

This PDF is available to Subscribers Only

Sign in or purchase a subscription to access this content. ×

You must be signed into an individual account to use this feature.

×