Open Access
Article  |   December 2018
Naturally glossy: Gloss perception, illumination statistics, and tone mapping
Journal of Vision December 2018, Vol.18, 4. doi:10.1167/18.13.4
      Wendy J. Adams, Gizem Kucukoglu, Michael S. Landy, Rafał K. Mantiuk; Naturally glossy: Gloss perception, illumination statistics, and tone mapping. Journal of Vision 2018;18(13):4. doi: 10.1167/18.13.4.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Recognizing materials and understanding their properties is very useful—perhaps critical—in daily life as we encounter objects and plan our interactions with them. Visually derived estimates of material properties guide where and with what force we grasp an object. However, the estimation of material properties, such as glossiness, is a classic ill-posed problem. Image cues that we rely on to estimate gloss are also affected by shape, illumination and, in visual displays, tone-mapping. Here, we focus on the latter two. We define some commonalities present in the structure of natural illumination, and determine whether manipulation of these natural “signatures” impedes gloss constancy. We manipulate the illumination field to violate statistical regularities of natural illumination, such that light comes from below, or the luminance distribution is no longer skewed. These manipulations result in errors in perceived gloss. Similarly, tone mapping has a dramatic effect on perceived gloss. However, when objects are viewed against an informative (rather than plain gray) background that reflects these manipulations, there are some improvements to gloss constancy: in particular, observers are far less susceptible to the effects of tone mapping when judging gloss. We suggest that observers are sensitive to some very simple statistics of the environment when judging gloss.

Introduction
We identify a wide range of materials (e.g., tea, brushed aluminum, wine, skin, cotton) with apparent ease. We can usually predict how an object will feel before we touch it; we are able to judge material qualities, including surface gloss from a single static image. The glossiness (shininess) of a surface can be inferred from the pattern of specular highlights, for example their brightness and sharpness (Hunter, 1937; Marlow & Anderson, 2013) and possibly image statistics such as skew (Motoyoshi, Nishida, Sharan, & Adelson, 2007; Sharan, Li, Motoyoshi, Nishida, & Adelson, 2008; but see also Anderson & Kim, 2009; Kim & Anderson, 2010). Glossy reflections are commonly modeled in computer graphics by the proportion of incoming illumination that is reflected specularly (i.e., in a mirror-like way) and how spread out/blurred these reflections are (Ward, 1992; see Figure 1a). A surface that reflects a large proportion of incoming light in a perfect, mirror-like way will have bright, sharp specular highlights. Therefore, these are valid cues to gloss. 
Figure 1
 
(a) Simple reflectance functions, showing variation in specularity and micro-roughness. (b) The ill-posed problem of estimating reflectance: Image changes can be caused by changes in reflectance, illumination, and/or tone-mapping. In each image one property (illumination, reflectance, or tone mapping) was modified while the other two were kept constant.
Unfortunately for the observer, other factors also affect these gloss cues. The pattern of specular highlights varies according to the shape of the reflecting surface and the pattern of incoming illumination. For example, specular highlights are more spatially compressed (and thus brighter) at regions of high curvature. In addition, highlights will be brighter when a greater proportion of the incoming illumination is directional, rather than ambient (e.g., on a sunny day). To achieve gloss constancy, humans must not only estimate an object's reflectance, but also estimate (and compensate for) object shape and the illumination. When viewing a glossy nectarine at a sunny picnic, we don't want to be confused when the sun goes behind thick clouds, and accuse our copicnickers of switching it for a matte peach (Landy, 2007). 
To complicate the task further, most images that we view in print or via a screen have been tone mapped to accommodate the limited range of available luminance values. Luminance contrast in real scenes considerably outstrips that which can be displayed on a standard monitor. Displayed images are, therefore, usually nonlinearly transformed in the luminance domain, often by compressing the distribution at the top and bottom ends, as in the commonly used sigmoidal tone-mapping function (Tumblin, Hodgkins, & Guenter, 1999). 
In summary therefore, and as illustrated in Figure 1b, changes to specular highlights (the cues used to judge gloss) can be caused by changes in any/all of the following: (a) reflectance, (b) illumination, (c) shape, and (d) image manipulations such as tone mapping.1 To separate out the confounded effects of these variables, the observer must rely on information about these variables from the image and/or prior knowledge about their probable values. 
Previous work suggests that in the absence of information about illumination (i.e., when an object is viewed in isolation), observers estimate gloss as though relying on prior assumptions about the illumination structure (e.g., that it contains strong edges). Fleming, Dror, and Adelson (2003) presented glossy spheres that had been rendered under various artificial illumination fields. For highly unnatural illumination (e.g., 1/f noise), observers perceived the resultant images to be matte. In a sense this is unsurprising: We know that observers' estimates of gloss increase with the brightness, sharpness, and coverage of specular highlights (Hunter, 1937; Marlow & Anderson, 2013; Marlow, Kim, & Anderson, 2012), and so when sharp, bright highlights are not visible in the image, the object no longer appears glossy. In the experiments of Fleming et al., observers were given no information about the illumination used to render the images: The object's image was pasted over an arbitrary checkered background. When stimuli do not contain information about the current illumination, observers only have their prior knowledge of illumination to guide gloss perception. 
In a similar vein, Olkkonen and Brainard (2010) and Pont and te Pas (2006) found that while observers are able to match the specular reflectance of objects rendered under the same illumination, they show failures in gloss constancy when comparing objects rendered under different illumination fields. In these experiments, stimuli were presented against arbitrary (uninformative) backgrounds, either in isolation or alongside other objects, which could arguably provide weak/indirect information about the spatial pattern of illumination. 
Normal viewing situations usually provide observers with some information about the current illumination, which could be used to optimize gloss judgments. When such information is available, do observers make use of it, to achieve partial or full gloss constancy over illumination changes? 
Motoyoshi and Matoba (2012) presented scenes depicting a complex object (a statue) in a room amongst other objects, and varied the illumination. They found failures in gloss constancy when observers were asked to match surface reflectance across different illumination environments. In addition, manipulating the object's background (varying contrast or gamma) had no effect on perceived gloss. In other words, observers failed to use the object's context to compensate for illumination changes and achieve gloss constancy. The authors propose that observers use simple image statistics from within the object's image to infer gloss, and ignore the background. A few factors should be kept in mind when interpreting these results: First, the same object (albeit from a slightly rotated viewpoint) was compared across scenes, which may have encouraged image matching, rather than gloss matching. Second, the outdoor illumination fields used for rendering were inconsistent with the indoor scene presented, and third, all stimuli were presented in grayscale, which may have limited the observers' abilities to segment specular from diffuse reflectance. 
Doerschner, Maloney, and Boyaci (2010) directly tested the effect of the mean luminance of an object's context using real scenes and found that this can affect perceived gloss. Observers viewed real spheres (either matte black with a white painted dot, or glossy black) in front of a black or a white background. Objects were perceived as somewhat glossier when presented against a black, rather than a white background. Similarly, observers report that the central region of a surface appears glossier when the surrounding region is darker (Hansmann-Roth & Mamassian, 2017). 
In summary, when information about the illumination is unavailable, observers must (and do) rely on prior assumptions about the illumination structure. Some authors suggest that even when cues to the current illumination are available, they are not used. Instead, we rely on certain characteristics of natural illumination environments such as dynamic range, skew, distribution of wavelet coefficients, and the dominant direction of illumination (Fleming et al., 2003; Motoyoshi & Matoba, 2012). In Bayesian terms, this reliance would amount to giving all weight to the prior, and none to the available information about the current illumination. For this approach to be successful (i.e., to result in accurate, stable estimates of gloss), illumination would need to be invariant in certain dimensions: those that affect the cues that we use to infer gloss. In this case, observers could estimate gloss directly from the object's image (ignoring the context) and gloss constancy would come for free, without any compensatory mechanism. 
Unfortunately, natural illumination varies in ways that do affect the image cues used to judge gloss (Motoyoshi & Matoba, 2012). Thus, models of gloss perception that ignore this variation predict that our perception of an object's material will vary across different natural illumination fields (as well as across artificially manipulated ones). In other words, we would fail to show gloss constancy. 
In addition to changes in illumination, tone mapping can also alter the luminance and contrast of specular highlights. Studies of gloss perception often use a sigmoidal tone mapping (Fleming et al., 2003; Pellacini, Ferwerda, & Greenberg, 2000; Tumblin et al., 1999; Wills, Agarwal, Kriegman, & Belongie, 2009), or similar compression of high intensities (Marlow & Anderson, 2013; Marlow et al., 2012; Motoyoshi & Matoba, 2012), in order to present images on a standard monitor. Phillips, Ferwerda, and Luka (2009) used a high-dynamic-range (HDR) display to compare the perceived gloss of stimuli with and without a sigmoidal tone mapping. Tone-mapped stimuli were perceived to be substantially less glossy than HDR stimuli. Similarly to Fleming et al. (2003), stimuli were cropped out of the illumination environment used for rendering, and presented against an arbitrary background. An interesting question thus remains: Would observers be gloss constant over tone-mapping manipulations, if more contextual image information were available? 
We can compare gloss constancy with lightness and color constancy. Well-known demonstrations of lightness and color constancy show that the perceived hue or lightness of a surface patch is strongly affected by the context in which it is viewed (Chevreul, 1855; Mollon, 1987). Despite these contextual effects, we also rely on prior assumptions about illumination when judging lightness: In the absence of explicit illumination information, observers assume that light is coming from above when judging the reflectance of a surface (Adams, 2007; Adams, Graf, & Ernst, 2004; Mamassian & Landy, 2001). As information about the current illumination increases, lightness constancy improves. Snyder, Doerschner and Maloney (2005) asked observers to judge the albedo of a surface patch within a 3D scene. Lightness constancy improved when specular spheres (which provided information about the illumination context) were added to the scene. Thus, for lightness judgments, observers act in a Bayesian manner, combining prior and current information about the illumination to optimize perception and improve constancy. In contrast to the extensive work on color and lightness constancy, relatively little is known about how (or the extent to which) observers combine prior knowledge about illumination with online sensory cues to achieve gloss constancy. 
Research questions
Here we investigate a set of related issues: First, we analyze the structure of a diverse set of natural illumination environments from the Southampton-York Natural Scenes (SYNS) dataset (Adams, Elder, et al., 2016) to investigate the statistical structure of natural illumination. Second, we render objects under natural and manipulated illumination environments to determine the effects on perceived gloss when statistical regularities are violated. Third, we investigate the effects of tone mapping on perceived gloss. Finally, we ask whether observers can exploit contextual information within an image to compensate for changes in illumination, or tone mapping, in order to achieve gloss constancy. 
To preview our key results: 
  •  We identify a number of characteristics of natural illumination that vary little across diverse natural scenes: the luminance distribution (highly skewed), the distribution of luminance contrast across frequencies (\(1/f^x\)), and the positive relationship between luminance and elevation (i.e., light from above).
  •  Manipulating two of these three characteristics affected perceived gloss: the luminance distribution and the dominant illumination direction. Providing explicit information about illumination, by presenting objects within their true environment, did have some effect on perceived gloss, but did not lead to gloss constancy.
  •  Conversely, tone mapping had a substantial effect on perceived gloss, but only when objects were presented in isolation; when the whole image was present, tone mapping had a much smaller effect on perceived gloss.
In summary, observers are disappointingly vulnerable to biases in perceived gloss when substantial, salient changes are made to the illumination. However, as long as an object is viewed within the context of a larger image, tone-mapping has only a small effect on gloss judgments; some compensation occurs such that (near) gloss constancy is achieved. 
Light-field analyses
The SYNS dataset includes HDR spherical illumination maps. The current analyses included 72 of these light-fields that were sampled at unique locations across Hampshire, UK, within a diverse array of scene categories (20 outdoor categories, six indoor categories). Each HDR light-field is 5392 × 2696 pixels; Figure 2 shows some examples. 
Figure 2
 
Eight example light-fields from the SYNS dataset. Some light-fields captured in full sun included an artifact below the sun (a bright, narrow vertical strip); this was removed by replacing the affected part of the image with the corresponding pixels from a second image, captured a few minutes later.
Previous studies have also explored the statistical characteristics of natural illumination. Dror, Willsky, and Adelson (2004) analyzed 104 HDR environments [95 images from Teller et al. (2003) and nine images from Debevec and Malik (1997)] of real-world scenes and found strong statistical regularities. For example, most pixels in each light-field were of low luminance, with a few very high luminance values due to small bright sources. In addition, illumination increased with elevation (although most images were restricted to elevations above 70°, where 0° is straight down), wavelet coefficient distributions were kurtotic, and neighboring pixels were highly correlated in their luminance values. 
Mury, Pont, and Koenderink (2009) built a custom “plenopter,” 12 HDR, large-field-of-view sensors arranged in a dodecahedron configuration. This allowed them to record light-fields in low resolution (up to second-order spherical harmonics). Similarly, Morgenstern, Geisler, and Murray (2014) built a multidirectional photometer consisting of 64 evenly spaced photodiodes to collect 570 (relatively low-pass) light-fields. Their analyses suggested that illumination fields are relatively diffuse, and that observers' reflectance estimates across changes in surface orientation are consistent with an assumption of similarly diffuse illumination. 
The current analyses extend previous work by using the SYNS high-resolution, spherical illumination maps, sampled in a principled manner from diverse scene categories. 
Luminance distribution
One of the simplest statistics we can consider is the distribution of luminance across the light-field, summarized by the luminance histogram (Figure 3, left column). We confirm previous assertions that natural illumination tends to be skewed (note the logarithmic x axis): there are a few very bright pixels. 
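As a concrete illustration, the histogram computation, including the solid-angle weighting described in the Figure 3 caption, can be sketched as follows. This is an illustrative Python/NumPy sketch, not the authors' MATLAB code; the function name and bin count are our own, and strictly positive luminance values are assumed.

```python
import numpy as np

def weighted_luminance_histogram(lum, n_edges=50):
    """Solid-angle-weighted luminance histogram of an equirectangular map.

    lum: 2D array of linear luminance; rows span elevation 0 deg
    (straight down) to 180 deg (straight up). Because an equirectangular
    projection oversamples the poles, each pixel is weighted by
    sin(theta) before histogramming. Log-spaced bins reflect the heavy
    positive skew of natural illumination (a few very bright pixels).
    """
    h, w = lum.shape
    theta = (np.arange(h) + 0.5) * np.pi / h       # elevation of each row
    weights = np.repeat(np.sin(theta), w)          # one weight per pixel
    edges = np.logspace(np.log10(lum.min()), np.log10(lum.max()), n_edges)
    hist, _ = np.histogram(lum.ravel(), bins=edges, weights=weights)
    return hist / hist.sum(), edges
```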
Figure 3
 
Luminance distributions (left) and luminance as a function of elevation (right). The luminance scale is arbitrary. Rows represent light-fields from indoor (upper), outdoor cloudy (middle), and outdoor sunny scenes (lower). Each gray line corresponds to a single scene; black lines show the median, and green lines show the 10th and 90th percentiles. Note that our analyses used 2D images created via equirectangular projection—each pixel covers the same angular range in elevation (θ) and azimuth, rather than equal solid angles on the view sphere. To compensate for this, frequency was weighted by sin(θ), where θ = 0° is downward and θ = 90° is towards the horizon.
Luminance-elevation relationship
In agreement with previous work (Dror et al., 2004) we find that, broadly speaking, luminance increases with elevation. However, for outdoor scenes, there is a local minimum at the horizon (90°), probably because vertical surfaces such as trees and walls of buildings are more prevalent at this elevation (Adams, Elder, et al., 2016) and these surfaces are unlikely to face the dominant illumination direction. This effect is especially pronounced for scenes captured during sunny, rather than overcast conditions (Figure 3, right column). 
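A minimal sketch of this elevation profile, again in illustrative Python rather than the authors' MATLAB pipeline (the function name and band count are assumptions):

```python
import numpy as np

def luminance_by_elevation(lum, n_bands=18):
    """Mean luminance per elevation band of an equirectangular light-field.

    lum: 2D array whose rows run from elevation 0 deg (down) to 180 deg
    (up); the row count is assumed divisible by n_bands. Returns the
    band-center elevations (degrees) and the mean luminance per band.
    For outdoor scenes this profile rises with elevation overall but
    dips near 90 deg (the horizon), where vertical surfaces dominate.
    """
    h = lum.shape[0]
    rows = h // n_bands
    centers = (np.arange(n_bands) + 0.5) * 180.0 / n_bands
    means = np.array([lum[b * rows:(b + 1) * rows].mean()
                      for b in range(n_bands)])
    return centers, means
```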
Energy as a function of spatial frequency
It is well known that natural images have a \(1/f^x\) spectral power distribution; contrast energy decreases with increasing spatial frequency (Field, 1987; Ruderman, 1994; Tolhurst, Tadmor, & Chao, 1992). We can analyze the power distribution within a spherical image (the light-field) in an analogous way using spherical harmonics. For our spherical harmonic analyses we used the s2kit toolbox (Healy, Rockmore, Kostelec, & Moore, 2003; Kostelec, Maslen, Healy, & Rockmore, 2000) for MATLAB (MathWorks, Natick, MA). 
Luminance contrast decreases with increasing spherical harmonic order (i.e., with increasing angular frequency), following a roughly \(1/f^x\) distribution for outdoor cloudy scenes. For indoor or sunny outdoor scenes, in which light sources (light bulbs or the sun) are visible in the image, there is more energy at relatively high frequencies. As noted previously (Dror et al., 2004), the relationship becomes linear in log-log coordinates if we instead analyze the log luminance of the light-field (Figure 4, right column). The mean slope of this log-log relationship across our light-fields is −2.45. 
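The slope fit itself is straightforward; below is an illustrative Python stand-in for the paper's MATLAB analysis, operating on a precomputed per-order power array (e.g., output of an s2kit transform):

```python
import numpy as np

def loglog_slope(power_per_order):
    """Slope of the spectral power distribution in log-log coordinates.

    power_per_order[l-1] is the averaged squared spherical-harmonic
    coefficient at order l (l = 1, 2, ...; the DC term is excluded).
    A least-squares line fit in log-log space returns the exponent;
    the paper reports a mean slope of about -2.45 for log luminance.
    """
    orders = np.arange(1, len(power_per_order) + 1)
    slope, _ = np.polyfit(np.log(orders), np.log(power_per_order), 1)
    return slope
```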
Figure 4
 
Spectral power distribution. The averaged squared coefficients per order, as a function of spherical harmonic order in luminance (left column) or log luminance (right column).
We ask whether observers rely on these fairly ubiquitous characteristics of illumination fields when estimating surface gloss. In other words, do we interpret specular highlights in a way that implicitly assumes that illumination fields are highly skewed, with predominantly overhead illumination and a typical \(1/f^x\) power distribution? Moreover, when information about the illumination field is available, can observers maintain gloss constancy when the illumination deviates from these characteristics? Experiment 1 investigates these questions: We render glossy objects under natural and manipulated illumination fields and measure the effect on gloss perception. 
Experiment 1: Gloss constancy across changes in illumination
Methods
Experimental stimuli
Eight outdoor light-fields were selected from the SYNS dataset (as shown in Figure 2) to represent a range of scene categories and weather conditions (sunny vs. overcast). The light-fields were manipulated using MATLAB (MathWorks) and the S2kit spherical harmonic toolbox (Healy et al., 2003; Kostelec et al., 2000) to create the three different manipulation conditions shown in Figure 5. As noted above, natural light-fields tend to be highly skewed, with just a few bright pixels, and previous researchers have suggested that skew may contribute to perceived gloss (Motoyoshi et al., 2007). For our uniform condition, we manipulated each light-field to have a uniform luminance distribution while preserving the spherical harmonic power spectrum. To this end, we followed an iterative process inspired by Heeger and Bergen's (1995) procedure for texture synthesis. First, luminance values were adjusted (preserving rank order) to create a uniform luminance distribution. Then, spherical harmonic power at each order was reset to its original value. These two steps were repeated seven times, after which the image changes were minimal. 
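The rank-preserving histogram step of this iteration can be sketched as below. This is an illustrative Python version (the paper's implementation used MATLAB and s2kit); the spectral step, resetting the spherical-harmonic power at each order, is omitted, and the function name is ours.

```python
import numpy as np

def match_histogram_uniform(values):
    """Rank-preserving remap of luminance values to a uniform distribution.

    Each value is replaced by a value drawn from a uniform distribution
    at the same rank, so the spatial ordering of bright and dark pixels
    is preserved while the luminance histogram is flattened. In the full
    Heeger-Bergen-style loop this alternates with resetting the
    spherical-harmonic power spectrum, repeated until changes are small.
    """
    flat = values.ravel()
    ranks = np.argsort(np.argsort(flat))   # rank 0..n-1 of each pixel
    uniform = np.linspace(flat.min(), flat.max(), flat.size)
    return uniform[ranks].reshape(values.shape)
```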
Figure 5
 
Above: an example light-field in its original form (standard), and after the three different illumination manipulations. Below: example stimuli, rendered under standard illumination, to illustrate the nine different stimulus gloss levels.
For our half-slope condition, the power distribution was manipulated to boost contrast at high, relative to low, angular frequencies, while preserving the luminance histogram. This was achieved via an iterative process similar to that used for the uniform condition: we alternated between adjusting the power distribution from \(1/f^x\) to \(1/f^{x/2}\) and adjusting the luminance histogram to match the original luminance distribution. 
Finally, as noted above, illumination increases with elevation in natural scenes, and we know that humans (and chickens) assume overhead illumination when estimating shape or reflectance from ambiguous shaded images (Adams, 2007; Hershberger, 1970; Mamassian & Landy, 2001). We investigated the role of overhead illumination in gloss perception by reversing the direction of the first directional spherical harmonic component, to create our Halloween condition. 
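A minimal sketch of this manipulation, assuming the light-field has already been expanded into spherical-harmonic coefficients (e.g., via s2kit); the dict-based representation and function name are illustrative, not the paper's code:

```python
def halloween_flip(coeffs):
    """Reverse the first-order (l = 1) spherical-harmonic component.

    coeffs: dict mapping (l, m) -> coefficient of the light-field
    expansion. Negating every l = 1 coefficient reverses the dominant
    illumination direction (light-from-above becomes light-from-below)
    while leaving the ambient (l = 0) term and higher-order structure
    untouched.
    """
    return {(l, m): (-c if l == 1 else c) for (l, m), c in coeffs.items()}
```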
Manipulations were applied to the luminance dimension, while preserving hue—images were first converted from RGB to HSV using the MATLAB function rgb2hsv. Following manipulations, light-fields were linearly scaled such that all light-fields, across all conditions, matched in mean luminance. Finally, the original hue was restored before reverting to RGB format. 
The stimulus objects were spheres, modulated with random noise to create different blobby “potato” shapes. Complex and varying shapes were used instead of spheres to encourage subjects to judge gloss, rather than performing simple image comparisons. Stimuli were rendered as though impaled above a transparent pedestal to enhance the impression that the virtual object was embedded within the scene. Examples are shown in Figure 6. 
Figure 6
 
Stimuli rendered under standard, uniform, half-slope, and Halloween light-fields. On the left, moderately glossy stimuli (level 3) are presented with the background visible. On the right, high-gloss stimuli (level 7) are presented against a gray background, rather than the true (rendering) context. Below, the schematic shows the slow reveal used in Experiment 1.
Software and apparatus
Stimuli were prerendered using Octane Render (Version 1.55, Otoy Inc.), a ray-trace renderer, under eight different light-fields, with each of the four illumination conditions: standard (no manipulation), uniform, half-slope, and Halloween. Stimuli were rendered with nine gloss levels, defined by the parameters of the renderer that correspond to (a) specular strength, i.e., the proportion of light reflected specularly, which varied linearly from 0.2 to 1, and (b) micro roughness (the degree of specular scatter), which varied from 0.25 to 0.01. The diffuse parameter was fixed at 0.2. See Figure 5 for examples of the nine gloss levels, rendered under standard illumination. 
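The nine parameter pairs can be generated as below. The text states only the endpoints for micro-roughness, so the linear interpolation used for that parameter is an assumption; the function name is ours.

```python
import numpy as np

def gloss_levels(n=9):
    """Renderer parameter pairs for the n stimulus gloss levels.

    Specular strength varies linearly from 0.2 to 1, as stated in the
    text; micro-roughness is interpolated linearly between the stated
    endpoints 0.25 and 0.01 (the interpolation scheme is assumed).
    The diffuse parameter is fixed at 0.2 throughout.
    """
    specular = np.linspace(0.2, 1.0, n)
    roughness = np.linspace(0.25, 0.01, n)  # assumed linear between endpoints
    return list(zip(specular, roughness))
```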
The experimental software was written in MATLAB using Psychophysics Toolbox Version 3 (Kleiner et al., 2007; Pelli, 1997). The experiment was carried out at two locations: New York University and the University of Southampton. At New York University, the experiment ran on a Mac Pro with 2.66 GHz Dual Core Intel Xeon equipped with an NVIDIA GeForce 7300GT graphics card. The display used for the experiment was a Dell P780 CRT monitor with a resolution of 1,024 × 768. At the University of Southampton, the experiment ran on a 27-in. iMac (3.2 GHz Intel Core i5) equipped with an NVIDIA GeForce GT 755M 1024 MB graphics card. The display had a resolution of 2,560 × 1,440. 
The luminance response (gamma) of the monitors at each location was measured. All stimuli were tone mapped using the same function (Equation 1) and then inverse-gamma corrected separately for each monitor to maintain linearity when displayed. We prefer this approach over adjusting the monitor to have a linear output, since the latter produces visibly discretized luminance at low levels.  
\begin{equation}\tag{1}f(x) = \begin{cases} 170x/c, & x < c \\ \dfrac{255}{1 + \exp\left(-k\,\dfrac{x - x_0}{c}\right)}, & x \ge c \end{cases}\end{equation}
 
The tone mapping was designed such that intensities up to the 99th percentile of all pixel values (across all stimuli) were linearly scaled. Intensities (x) below this cutoff, c (7, arbitrary units), were linearly mapped to the lower 2/3 of the available luminance range, i.e., (0, 170). Intensities above c were nonlinearly mapped as shown in Equation 1 to the upper 1/3 of the available range, i.e., (170, 255). To give a smooth and continuous function (no discontinuity in luminance or slope), \(k = \log (2)/\left( {1 - \left( {{x_0}/c} \right)} \right)\) and \({x_0} = c\left( {1 - \log (2)/3} \right)\). 
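For concreteness, Equation 1 and its continuity constants can be written as a short function (a Python sketch rather than the experiment's MATLAB code; the function name is ours, with c = 7 as stated above):

```python
import numpy as np

def tone_map(x, c=7.0):
    """Piecewise tone mapping of Equation 1: linear up to the cutoff c
    (the 99th percentile of pixel values, mapped onto [0, 170]), then a
    logistic compression of the remaining intensities into (170, 255).
    x0 and k are chosen so that both value and slope are continuous at
    x = c."""
    x = np.asarray(x, dtype=float)
    x0 = c * (1 - np.log(2) / 3)
    k = np.log(2) / (1 - x0 / c)      # simplifies to k = 3
    return np.where(x < c,
                    170 * x / c,
                    255 / (1 + np.exp(-k * (x - x0) / c)))
```

At x = c both branches evaluate to 170 and both have slope 170/c, so the curve is smooth across the cutoff, as the text requires.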
Observers were seated 55 cm from the display and viewed the stimuli monocularly with their head stabilized via a chin-rest. Stimuli were 36° × 43.75°, with the glossy object subtending approximately 8° of visual angle. 
Observers
Twelve observers completed the study: six at New York University, six at the University of Southampton. All observers, except for GK (one of the authors), were unaware of the purposes of the experiment. All observers had normal or corrected-to-normal vision. Methods were approved by the New York University Committee on Activities Involving Human Subjects and the University of Southampton Ethics Committee. Subjects gave informed consent prior to testing. 
Procedure and data analysis
On each trial, observers made a two-interval forced-choice decision to report (via key press) which of two sequential stimuli was “made out of a glossier material.” Each stimulus was displayed for 3 s, separated by a brief (0.1 s) blank screen. Each stimulus was gradually revealed within the first 1 s of display time (Figure 6) to encourage observers to attend to the whole scene (rather than only comparing the central objects). Following stimulus offset, a response screen displayed a prompt with a reminder of the response keys. Observers were given unlimited time to respond and received no feedback. 
Within each trial, both objects were rendered under the same light-field but could differ in light-field manipulation and/or gloss level. On the majority of trials (3024 of 3114 total), stimuli were either both presented in context, i.e., with a portion of the rendering light-field visible in the background (background-present), or both presented against a gray background (background-absent). In the latter case, the luminance of the background matched the mean luminance of all backgrounds for the background-present stimuli. On a smaller number of trials (90), one stimulus was shown with, and the other without, its background. When the two stimuli shared a common manipulation condition, their gloss levels differed by 1 to 3; in other cases they differed by 0 to 4. For example, a Halloween stimulus of gloss level 1 was compared to gloss levels 2, 3, and 4 within the Halloween condition. Stimulus comparisons were selected to maximize informative trials (i.e., those in which some confusability was expected, such as when the two objects had similar gloss values) and to allow the perceived glossiness of all stimuli to be quantified on a common scale (i.e., all stimuli were directly or indirectly compared, with no disconnected subsets in which one subset was always judged less glossy than another). Stimulus order was randomized within and across trials to avoid any effect of interval bias (Yeshurun, Carrasco, & Maloney, 2008). 
Data analyses followed Thurstone Case V scaling (Thurstone, 1927). In this scaling procedure, each stimulus condition (combination of illumination manipulation, context presence/absence, and gloss level) is treated as having a unidimensional mean perceived gloss. Individual stimulus presentations (which varied in terms of the shape of the target object, and the light field identity) were assumed to invoke perceived gloss given by the condition mean perturbed by Gaussian noise (for the software, see https://github.com/mantiuk/pwcmp). Thus, data were pooled across light field identity and object shape within each condition. 
For Thurstonian scaling, for every possible pair of conditions (e.g., A and B), the number of trials in which stimuli from condition A are reported as glossier than those from B is recorded alongside the total number of A, B comparisons. Observing a large proportion of “A is glossier” responses is more probable if the mean perceived gloss of A is greater than that of B. In addition, the relationship between conditions is assumed to be transitive, i.e., if A > B and B > C, then A > C. Mean perceived gloss values (in relative JNDs) were fit simultaneously to all conditions (9 gloss levels × 4 manipulations × 2: background present/absent) for each observer via maximum-likelihood estimation. 
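The scaling step can be sketched as follows (a minimal Python illustration of Case V maximum-likelihood fitting; all names are hypothetical, and the authors' own implementation is the pwcmp toolbox at the URL above):

```python
# Minimal sketch of Thurstone Case V scaling by maximum likelihood.
# Under Case V, each condition i has a scale value s_i, and
# P(i judged glossier than j) = Phi((s_i - s_j) / sqrt(2)).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def case_v_scale(wins):
    """wins[i, j] = number of trials on which condition i was judged
    glossier than condition j (diagonal ignored). Returns scale values
    in JND units, with condition 0 anchored at zero."""
    n = wins.shape[0]

    def neg_log_lik(free):
        s = np.concatenate(([0.0], free))           # anchor s_0 = 0
        d = (s[:, None] - s[None, :]) / np.sqrt(2)  # pairwise scale differences
        p = np.clip(norm.cdf(d), 1e-9, 1 - 1e-9)    # P(i > j), clipped for log
        return -np.sum(wins * np.log(p))            # binomial log-likelihood

    res = minimize(neg_log_lik, np.zeros(n - 1), method="BFGS")
    return np.concatenate(([0.0], res.x))
```

Because only scale differences enter the likelihood, one condition must be anchored (here at zero); the fitted values are then interpretable as relative JNDs, as in the analyses reported below.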
Results
First, we consider the effects of the three different light-field manipulations on perceived gloss in the absence of information about the prevailing illumination (Figure 7, top row). Our light-field manipulations had a significant effect on perceived glossiness: two-factor ANOVA, with gloss level and illumination manipulation as predictors, main effect of manipulation, F(3, 33) = 40.8, p < 0.001; all ANOVA analyses and post hoc comparisons were performed using SPSS. Stimuli rendered under uniform light-fields were perceived as significantly less glossy than those under natural illumination (p < 0.001), whereas the decrease in perceived gloss under the Halloween condition did not reach significance following Bonferroni correction for multiple comparisons (p > 0.05). Our data suggest that observers have internalized certain characteristics of natural light-fields (luminance skew, and possibly illumination from above) and rely on these for gloss estimation. 
Figure 7
 
Data from Experiment 1 for the background-absent (top row) and background-present (bottom row) conditions. First column: Example data for one naïve observer. Second column: Group data, averaged across all observers. Third column: Summary data: mean perceived gloss for each illumination manipulation, averaged across gloss levels. Fourth column: Sensitivity, defined as the difference in perceived gloss, in JND units, between the most glossy and most matte stimuli. Error bars: ±1 SE across observers. Curves are 2nd-order polynomial fits to the data.
There was no significant effect of manipulating the power distribution (half-slope condition) on perceived gloss in the background-absent condition. Under this manipulation, the luminance distribution was held fixed while the spectral slope varied. Thus, although natural illumination fields differ very little in their distribution of spectral power across frequencies, observers do not seem to rely on this characteristic when estimating gloss. In contrast, perceived gloss is affected by a change in the predominant illumination direction, or in the luminance distribution (e.g., its skew). 
Recall that we manipulated the illumination field rather than directly manipulating the stimulus images. Next, we consider whether any simple image statistics might explain the reduction in perceived gloss under the Halloween and uniform conditions. The effects of our manipulations on simple statistics of the luminance distribution within the image of the judged object are presented in Figure 8. A linear regression reveals that, in the background-absent conditions, variations in perceived gloss are well approximated by a combination of Michelson contrast and skew. This simple model explains 86% of variance in perceived gloss, across the 36 conditions (9 stimulus gloss levels and 4 illumination manipulations). One predictor alone (contrast, maximum, or range) can explain only around 70% of variance; the addition of a second parameter is supported by leave-one-out cross-validation. Adding a third predictor scarcely helps (producing an increase in r2 of less than 1%). 
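The model-selection step can be illustrated with a small leave-one-out routine (a Python sketch on synthetic stand-in data; the paper's actual per-condition statistics are those plotted in Figure 8, and the function name is ours):

```python
# Sketch of the leave-one-out comparison used to decide whether a second
# predictor (e.g., skew, alongside Michelson contrast) is warranted.
import numpy as np

def loo_press(X, y):
    """Leave-one-out predicted residual sum of squares (PRESS) for an
    ordinary-least-squares model: refit with each condition held out and
    accumulate its squared out-of-sample prediction error."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        press += float(y[i] - X[i] @ beta) ** 2
    return press
```

A lower PRESS for the contrast-plus-skew design matrix than for contrast alone supports keeping both predictors, mirroring the cross-validation argument above; adding predictors that do not generalize inflates PRESS rather than reducing it.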
Figure 8
 
Simple statistics characterizing the luminance distribution within the object's image (top row) or the background context (bottom row), averaged across light fields and random variations in object shape.
Our primary question, however, is whether observers can use information about the illumination field, when available, to improve gloss constancy. If observers completely ignored the context in which the object is presented, the data would be identical for the background-absent and background-present conditions. In contrast, if observers use the background information to compensate for the current illumination conditions, i.e., to improve gloss constancy, we would expect perceived gloss to become more similar across the four manipulation conditions when the background is present. However, when the background was present (Figure 7, lower row), the illumination manipulations still had a significant effect on perceived gloss: two-factor ANOVA, with gloss level and illumination manipulation as predictors, main effect of manipulation, F(3, 33) = 40.8, p < 0.001; stimuli rendered under any of the three manipulated illuminations were perceived as significantly less glossy than those rendered under the standard illumination (all p < 0.05 following Bonferroni correction). 
Importantly, we can compare perceived gloss across the background-present and background-absent conditions to see whether contextual information is used when judging perceived gloss. There was a significant interaction between background presence/absence and light-field manipulation: three-factor ANOVA, with gloss level, illumination manipulation, and background presence as predictors; interaction between manipulation and presence, F(3, 33) = 47.5, p < 0.001, suggesting that contextual information from the illumination environment does have an effect on gloss perception. However, contrary to our hypothesis, it does not improve gloss constancy. When the background is present, stimuli rendered under the half-slope condition are perceived as less glossy than those rendered under a normal light-field (and less glossy than when the same stimuli were presented against an arbitrary gray background). This can be thought of as a failure, or at least a reduction, of gloss constancy. Why would the presence of the background reduce perceived gloss in the half-slope condition? Changing the slope of the power spectrum (while preserving the luminance histogram) produced multiple small bright spots within the visible background; the contrast of the background was increased (relative to standard illumination), whereas the contrast within the object region was reduced. 
A parsimonious explanation of both the background-present and background-absent conditions is thus that observers are sensitive not only to contrast and skew within the object's image, but also to how these characteristics compare to the luminance profile of the background (Figure 8). This makes sense: high-contrast illumination fields with large skew (i.e., a few very bright regions) should lead to high-contrast and bright specular reflections on glossy objects. In line with this, perceived gloss across all 72 conditions (background present/absent, four illumination manipulations, nine gloss levels) is well explained by positive effects of contrast and skew within the object's image, combined with suppressive effects of contrast and skew within the background (r2 = 0.87 from linear regression; the addition of the two background parameters is supported by leave-one-out cross-validation). 
In the half-slope condition, skew is increased, but contrast is decreased within the image of the target object (relative to the standard condition). Accordingly, there is little change in perceived gloss following this manipulation in the background-absent condition. However, both contrast and skew are substantially elevated within the background, and this is accompanied by a decrease in perceived gloss in the background-present condition. In the Halloween condition, skew within the object's image is similar to that of the standard stimulus whereas contrast is decreased. This combination is accompanied by a decrease in perceived gloss in the background-absent condition. Within the background, skew is increased relative to the standard condition, and Michelson contrast is approximately unchanged. Accordingly, when the background is present in the Halloween condition, there is a small reduction in perceived gloss (relative to background-absent). In the uniform condition, large decreases in skew and contrast within the object's image are accompanied by a large decrease in perceived gloss. Notably, however, the reduced contrast and skew in the background for this condition do not produce an increase in perceived gloss in the uniform, background-present condition (relative to the background-absent condition). It is possible that there is an asymmetry in the effects of contrast and skew within the surrounding context of a viewed object: Unusually large values have a suppressive effect on perceived gloss; however, a reduction in the highlight-inducing features of the background has little effect on perceived gloss. In fact, a similar observation can be made in relation to the effect of a uniform background in the background-absent conditions: A zero-contrast background does not inflate perceived gloss, as one might otherwise expect. 
Finally, we investigated whether the presence of the background improves gloss discrimination, indexed by the range of perceived gloss values in JND space for each condition (Figure 7, fourth column). Background presence might improve gloss discrimination by giving observers explicit information about the illumination, which could serve as a useful reference for correctly interpreting specular highlights. To this end, we analyzed data from the background-present and background-absent conditions independently. However, we did not find a significant difference in gloss discriminability when a portion of the illumination was visible in the background versus when it was absent: two-factor ANOVA with manipulation and presence/absence as predictors; nonsignificant main effect of presence/absence, F(1, 11) = 0.85, p > 0.05. Interestingly, however, discrimination did depend on the illumination manipulation: main effect of illumination, F(3, 33) = 13.4, p < 0.001. Perceived gloss increased more rapidly with stimulus gloss for the half-slope condition than under standard illumination (p < 0.01, following Bonferroni correction). This is consistent with the influence of luminance skew on gloss perception; skew increases more dramatically with stimulus gloss in this condition (Figure 8, top-right plot). There was no effect of interval bias apparent in observer responses: Observers were equally likely to select the stimulus with the higher gloss level as glossier when it appeared in the first or second interval. For both intervals, this response probability was 0.78. 
Experiment 2: Gloss constancy across tone mapping
As noted above, we often view images that have been tone mapped (compressed in the luminance domain) to allow presentation on a standard monitor. One might expect this manipulation to affect perceived gloss, given that tone mapping will reduce image contrast (and the brightness of specular highlights). Here, we ask whether tone mapping does indeed have an effect on perceived gloss, and whether this effect is minimized when contextual information is provided. 
Methods
The methods and experimental stimuli were broadly similar to Experiment 1, with notable exceptions: We used only two illumination conditions, either standard or tone mapped. Importantly, all stimuli were presented on a custom-built, high-dynamic-range display allowing stimuli to be displayed with luminance ranging from 0.01 to 5,000 cd/m2. The projector-based HDR display (Seetzen et al., 2004; Wanat, Petit, & Mantiuk, 2012) consisted of a 3,000 lumen 1,024 × 768 DLP projector with the color wheel removed, acting as a backlight, and a 9.7-in. 2,048 × 1,536 LCD panel from an iPad 3, from which we removed the backlight. The geometric transformation required to align the two displays was found with the help of a camera. The display was calibrated to reproduce the rec.709/sRGB color gamut. 
We used a standard sigmoidal tone-mapping function (Tumblin et al., 1999) to mirror that used in many previous psychophysical studies. The function, shown in Figure 9a, reduced the dynamic range of the input image to the luminance range 1 to 100 cd/m2. Luminance was approximately preserved across the lower range, but smoothly compressed at the higher end. Although modern monitors can achieve minimum luminance below 1 cd/m2, our elevated “black level” simulated screen reflections due to ambient light. 
Figure 9
 
Experiment 2. (a) Tone-mapping function used to map high-dynamic-range values to a simulated low-dynamic-range display (blue line). The identity function (slope 1) is shown as a dashed black line for reference. (b) Perceived gloss of stimuli with or without tone mapping when presented without the background, and (c) with the background present.
Similarly to Experiment 1, observers viewed two stimuli in sequence on each trial, which could differ in glossiness (nine possible levels) and/or illumination condition (tone mapping present or absent). When both stimuli within a trial were from the same illumination condition they differed by less than 3 gloss levels. Stimuli were rendered under a single illumination field (Figure 9). Within each trial, both stimuli were either presented against the correct background (i.e., that matched the illumination condition: either standard or tone-mapped) or against a gray background of 10 cd/m2. This background luminance, the midpoint of the log luminance range of the tone-mapped stimuli, was approximately equal to the mean luminance of the background in the tone-mapped condition (10.6 cd/m2). Each observer completed 420 trials split across six short sessions. Stimulus order was randomized within and across trials. 
Observers
Nine observers completed the study. All observers, except for RM (one of the authors), were unaware of the purposes of the experiment. All observers had normal or corrected-to-normal vision. The experiment was approved by the University of Bangor Ethics Committee. Subjects gave informed consent prior to participating. 
Results
Similarly to Experiment 1, observer responses were converted to perceived gloss in JNDs via Thurstonian scaling. The results are shown in Figure 9. As in Experiment 1, there was no significant effect of stimulus order: Observers were slightly more likely to identify the stimulus with the higher gloss level as glossier when it was presented in the second interval (response proportion 0.91) than in the first (0.85), but this difference was not significant, t(16) = −2.08, p > 0.05. Moreover, because stimulus order was randomized, any interval bias would not have had a systematic effect on reported perceived gloss. 
When the background was absent (left plot), tone mapping had a significant effect on perception: Objects were seen as substantially less glossy; two-factor ANOVA, with stimulus gloss level and tone-mapping condition as predictors; main effect of tone mapping, F(1, 8) = 17.3, p < 0.01. On average, tone mapping produced a decrease in perceived gloss of 2.5 JNDs in the background-absent condition. This finding is consistent with our results from Experiment 1: The tone-mapping manipulation decreased image contrast and skew. However, our key question is whether observers can compensate for tone mapping (i.e., become gloss constant) when contextual information is available. Our data suggest that observers can, to a large extent, compensate for tone mapping when the background image is available; the effect of tone mapping on perceived gloss was much smaller when the background was present than when it was absent: a three-factor ANOVA with gloss level, tone mapping, and background presence/absence revealed a significant interaction between tone mapping and background presence, F(1, 8) = 5.4, p < 0.05. However, tone mapping still significantly reduced perceived gloss, even with the background present, F(1, 8) = 10.9, p < 0.05. 
Discussion
The present study investigated the effect of illumination and tone mapping on the perceived glossiness of complex 3D objects. Perceived glossiness was significantly reduced by illumination manipulations that changed the shape of the luminance distribution or the dominant direction of illumination. When contextual illumination information was present, observers did not become gloss constant, but this contextual information did have an effect on perceived gloss. Analyses of some simple image statistics suggest that when judging gloss, observers are sensitive to luminance contrast and skew, both within the object's image and, to some extent, also in the object's surroundings. 
In line with previous literature, we confirm that images of objects that are higher in contrast and skew are perceived as glossier. Importantly, however, observers' estimates of gloss are also predicted by the characteristics of the background: Objects are perceived as less glossy when viewed within environments with greater contrast and skew. In other words, when the background indicates that incoming illumination contains some very bright regions, objects will be perceived as highly glossy only when they produce high-contrast specular highlights. 
Experiment 1 provided some evidence that these contextual effects are asymmetric: Reduced levels of contrast and skew in the background were not associated with substantial increases in perceived gloss. This can be understood in terms of a detection problem: For smoothly curved objects such as our experimental stimuli, the object's image contains reflected light from the entire spherical light field (with the exception of the small region that is directly occluded from view by the object). Thus, visible illuminants will produce visible highlights. In contrast, only a smaller portion of the light field (in angular terms) is directly visible to the observer. It is therefore possible, or indeed probable (given the structure of natural illumination), that a light field will contain bright, high-contrast regions, even when none are visible (e.g., the sun may be directly above the viewer). In other words, a low-contrast background does not preclude high-contrast reflections from unseen illuminants. 
It is worth noting that simple statistics, such as luminance contrast and skew, are not the whole story when considering perceived gloss. Clearly, image structure is also important, as demonstrated by Anderson and Kim (2009) and Kim and Anderson (2010). We have characterized the effects of our illumination manipulations in terms of the contrast and skew of the object and background, with the advantage that these image properties are simple to quantify objectively and explain a large proportion of the variation in our observers' gloss judgments, including contextual effects. It is certainly possible that our observers were sensitive to other correlated or additional image features. Marlow and colleagues (Marlow & Anderson, 2013; Marlow et al., 2012) asked observers to rate images of glossy objects according to different dimensions of the specular highlights, including their coverage, contrast, sharpness, and skew. They found that weighted averages of these ratings were correlated with perceived gloss. 
Previous work has shown that gloss perception can be affected by changes in illumination (Doerschner, Boyaci, & Maloney, 2010; Fleming et al., 2003; Motoyoshi & Matoba, 2012; Olkkonen & Brainard, 2010, 2011; Pont & te Pas, 2006; te Pas & Pont, 2005). We extended that work by using natural illumination fields that we directly manipulated to understand (a) which statistical regularities of natural illumination are relied on when judging gloss, and (b) which of these are estimated from contextual information, when available. Our analyses of natural illumination fields confirmed that natural illumination has a 1/f^2.5 power distribution, a highly skewed luminance distribution, and luminance that increases as a function of elevation. We found that changes in the luminance distribution that removed skew and changes to the dominant illumination direction caused objects to be perceived as less glossy. Altering the spectral power distribution had little effect on perceived gloss (when stimulus objects were viewed against an arbitrary gray background). 
Most previous research has (implicitly or explicitly) assumed that observers rely on a predetermined notion of what natural illumination is, rather than using any online estimation of illumination. In line with this assumption, i.e., that the stimulus background is ineffectual, stimuli have been presented against a plain gray (Doerschner, Boyaci, et al., 2010; Olkkonen & Brainard, 2010; Pont & te Pas, 2006; te Pas & Pont, 2005) or checkered background (Fleming et al., 2003; Olkkonen & Brainard, 2011). This mode of presentation limits the available information that might allow an observer to disentangle the effects of illumination (or tone mapping) from the effects of reflectance. 
We explicitly tested whether the presence of contextual information affects gloss judgments. In Experiment 1, a visible background did not improve gloss constancy but did affect judgments of gloss, in a manner that suggests that observers are sensitive to contrast within the illumination field. Moreover, in our second experiment we found a dramatic improvement in gloss constancy across tone mapping when contextual information was available. 
Finally, it is worth remembering that in natural viewing, gloss constancy may be further improved by exploiting additional information that was not available in the current study, including motion (Doerschner et al., 2011; Hartung & Kersten, 2002; Wendt, Faul, Ekroll, & Mausfeld, 2010), binocular disparity (Kerrigan & Adams, 2013; Obein, Knoblauch, & Viénot, 2004; Wendt et al., 2010), texture (te Pas & Pont, 2005), and even haptic cues such as friction (Adams, Kerrigan, & Graf, 2016). It remains to be seen whether observers are able to accurately estimate gloss (surely a useful skill) in naturalistic, full-cue situations. 
Acknowledgments
Supported by an EPSRC grant (EP/K005952/1) to WJA and an NIH grant (EY08266) to MSL. 
Commercial relationships: none. 
Corresponding author: Wendy J. Adams. 
Address: Department of Psychology, University of Southampton, Southampton, UK. 
References
Adams, W. J. (2007). A common light-prior for visual search, shape, and reflectance judgments. Journal of Vision, 7 (11): 11, 1–7, https://doi.org/10.1167/7.11.11. [PubMed] [Article]
Adams, W. J., Elder, J. H., Graf, E. W., Leyland, J., Lugtigheid, A. J., & Muryy, A. (2016). The Southampton-York Natural Scenes (SYNS) dataset: Statistics of surface attitude. Scientific Reports, 6, 35805.
Adams, W. J., Graf, E. W., & Ernst, M. O. (2004). Experience can change the ‘light-from-above' prior. Nature Neuroscience, 7, 1057–1058.
Adams, W. J., Kerrigan, I. S., & Graf, E. W. (2016). Touch influences perceived gloss. Scientific Reports, 6, 21866.
Anderson, B. L., & Kim, J. (2009). Image statistics do not explain the perception of gloss and lightness. Journal of Vision, 9 (11): 10, 1–17, https://doi.org/10.1167/9.11.10. [PubMed] [Article]
Chevreul, M. E. (1855). The principles of harmony and contrast of colours, and their applications to the arts. London, UK: Longman, Brown, Green, and Longmans.
Debevec, P. E., & Malik, J. (1997). Recovering high dynamic range radiance maps from photographs. Paper presented at the Proceedings of SIGGRAPH, 1997, Los Angeles, CA.
Doerschner, K., Boyaci, H., & Maloney, L. T. (2010). Estimating the glossiness transfer function induced by illumination change and testing its transitivity. Journal of Vision, 10 (4): 8, 1–9, https://doi.org/10.1167/10.4.8. [PubMed] [Article]
Doerschner, K., Fleming, R. W., Yilmaz, O., Schrater, P. R., Hartung, B., & Kersten, D. (2011). Visual motion and the perception of surface material. Current Biology, 21, 2010–2016.
Doerschner, K., Maloney, L. T., & Boyaci, H. (2010). Perceived glossiness in high dynamic range scenes. Journal of Vision, 10 (9): 11, 1–11, https://doi.org/10.1167/10.9.11. [PubMed] [Article]
Dror, R. O., Willsky, A. S., & Adelson, E. H. (2004). Statistical characterization of real-world illumination. Journal of Vision, 4 (9): 11, 821–837, https://doi.org/10.1167/4.9.11. [PubMed] [Article]
Field, D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A: Optics Image Science and Vision, 4, 2379–2394.
Fleming, R. W., Dror, R. O., & Adelson, E. H. (2003). Real-world illumination and the perception of surface reflectance properties. Journal of Vision, 3 (5): 3, 347–368, https://doi.org/10.1167/3.5.3. [PubMed] [Article]
Hansmann-Roth, S., & Mamassian, P. (2017). A glossy simultaneous contrast: Conjoint measurements of gloss and lightness. i-Perception, 8 (1): e2041669516687770.
Hartung, B., & Kersten, D. (2002). Distinguishing shiny from matte. Journal of Vision, 2 (7): 551, https://doi.org/10.1167/2.7.551. [Abstract]
Healy, D. M., Rockmore, D. N., Kostelec, P. J., & Moore, S. (2003). FFTs for the 2-sphere-improvements and variations. Journal of Fourier Analysis and Applications, 9, 341–385.
Heeger, D. J., & Bergen, J. R. (1995). Pyramid-based texture analysis/synthesis. Proceedings of SIGGRAPH 1995, Los Angeles, CA, 229–238.
Hershberger, W. (1970). Attached-shadow orientation perceived as depth by chickens reared in an environment illuminated from below. Journal of Comparative and Physiological Psychology, 73, 407–411.
Ho, Y.-X., Landy, M. S., & Maloney, L. T. (2008). Conjoint measurement of gloss and surface texture. Psychological Science, 19, 196–204.
Hunter, R. S. (1937). Methods of determining gloss. Journal of Research of the National Bureau of Standards, 18, 19–39.
Kerrigan, I. S., & Adams, W. J. (2013). Highlights, disparity, and perceived gloss with convex and concave surfaces. Journal of Vision, 13 (1): 9, 1–10, https://doi.org/10.1167/13.1.9. [PubMed] [Article]
Kim, J., & Anderson, B. L. (2010). Image statistics and the perception of surface gloss and lightness. Journal of Vision, 10 (9): 3, 1–17, https://doi.org/10.1167/10.9.3. [PubMed] [Article]
Kleiner, M., Brainard, D., Pelli, D., Ingling, A., Murray, R., & Broussard, C. (2007). What's new in Psychtoolbox-3. Perception, 36(14), ECVP Abstract Supplement.
Kostelec, P. J., Maslen, D. K., Healy, D. M., & Rockmore, D. N. (2000). Computational harmonic analysis for tensor fields on the two-sphere. Journal of Computational Physics, 162, 514–535.
Landy, M. S. (2007, May 10). Visual perception—A gloss on surface properties. Nature, 447 (7141), 158–159.
Mamassian, P., & Landy, M. S. (2001). Interaction of visual prior constraints. Vision Research, 41, 2653–2668.
Marlow, P. J., & Anderson, B. L. (2013). Generative constraints on image cues for perceived gloss. Journal of Vision, 13 (14): 2, 1–23, https://doi.org/10.1167/13.14.2. [PubMed] [Article]
Marlow, P. J., & Anderson, B. L. (2015). Material properties derived from three-dimensional shape representations. Vision Research, 115 (Part B), 199–208.
Marlow, P. J., & Anderson, B. L. (2016). Motion and texture shape cues modulate perceived material properties. Journal of Vision, 16 (1): 5, 1–14, https://doi.org/10.1167/16.1.5. [PubMed] [Article]
Marlow, P. J., Kim, J., & Anderson, B. L. (2012). The perception and misperception of specular surface reflectance. Current Biology, 22, 1909–1913.
Marlow, P. J., Todorović, D., & Anderson, B. L. (2015). Coupled computations of three-dimensional shape and material. Current Biology, 25, R221–R222.
Mollon, J. D. (1987). The origins of modern color science. In Shevell S. (Ed.), The science of color (2nd ed., pp. 1–36). Oxford, UK: Elsevier.
Morgenstern, Y., Geisler, W. S., & Murray, R. F. (2014). Human vision is attuned to the diffuseness of natural light. Journal of Vision, 14 (9): 15, 1–18, https://doi.org/10.1167/14.9.15. [PubMed] [Article]
Motoyoshi, I., & Matoba, H. (2012). Variability in constancy of the perceived surface reflectance across different illumination statistics. Vision Research, 53, 30–39.
Motoyoshi, I., Nishida, S., Sharan, L., & Adelson, E. H. (2007, May 10). Image statistics and the perception of surface qualities. Nature, 447 (7141), 206–209.
Mury, A. A., Pont, S. C., & Koenderink, J. J. (2009). Representing the light field in finite three-dimensional spaces from sparse discrete samples. Applied Optics, 48, 450–457.
Obein, G., Knoblauch, K., & Viénot, F. (2004). Difference scaling of gloss: Nonlinearity, binocularity, and constancy. Journal of Vision, 4 (9): 4, 711–720, https://doi.org/10.1167/4.9.4. [PubMed] [Article]
Olkkonen, M., & Brainard, D. H. (2010). Perceived glossiness and lightness under real-world illumination. Journal of Vision, 10 (9): 5, 1–19, https://doi.org/10.1167/10.9.5. [PubMed] [Article]
Olkkonen, M., & Brainard, D. H. (2011). Joint effects of illumination geometry and object shape in the perception of surface reflectance. i-Perception, 2, 1014–1034.
Pellacini, F., Ferwerda, J. A., & Greenberg, D. P. (2000). Toward a psychophysically-based light reflection model for image synthesis. SIGGRAPH 2000, 55–64.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Phillips, J. B., Ferwerda, J. A., & Luka, S. (2009). Effects of image dynamic range on apparent surface gloss. 17th Color and Imaging Conference (pp. 193–197). Bellingham, WA: Society for Imaging Science and Technology.
Pont, S. C., & te Pas, S. F. (2006). Material-illumination ambiguities and the perception of solid objects. Perception, 35, 1331–1350.
Ruderman, D. L. (1994). The statistics of natural images. Network: Computation in Neural Systems, 5, 517–548.
Seetzen, H., Heidrich, W., Stuerzlinger, W., Ward, G., Whitehead, L., Trentacoste, M.,… Vorozcovs, A. (2004). High dynamic range display systems. ACM Transactions on Graphics, 23, 760–768.
Sharan, L., Li, Y., Motoyoshi, I., Nishida, S., & Adelson, E. H. (2008). Image statistics for surface reflectance perception. Journal of the Optical Society of America A, 25, 846–865.
Snyder, J. L., Doerschner, K., & Maloney, L. T. (2005). Illumination estimation in three-dimensional scenes with and without specular cues. Journal of Vision, 5 (10): 8, 863–877, https://doi.org/10.1167/5.10.8. [PubMed] [Article]
te Pas, S. F., & Pont, S. C. (2005). A comparison of material and illumination discrimination performance for real rough, real smooth and computer generated smooth spheres. In Proceedings of the 2nd Symposium on Applied Perception in Graphics and Visualization (pp. 75–81). La Coruña, Spain.
Teller, S., Antone, M., Bodnar, Z., Bosse, M., Coorg, S., Jethwa, M., et al. (2003). Calibrated, registered images of an extended urban area. International Journal of Computer Vision, 53, 93–107.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.
Tolhurst, D. J., Tadmor, Y., & Chao, T. (1992). Amplitude spectra of natural images. Ophthalmic and Physiological Optics, 12, 229–232.
Tumblin, J., Hodgins, J. K., & Guenter, B. K. (1999). Two methods for display of high contrast images. ACM Transactions on Graphics, 18, 56–94.
Wanat, R., Petit, J., & Mantiuk, R. (2012). Physical and perceptual limitations of a projector-based high dynamic range display. In Carr H. & Czanner S. (Eds.), Theory and practice of computer graphics. Rutherford, UK: The Eurographics Association.
Ward, G. J. (1992). Measuring and modeling anisotropic reflection. SIGGRAPH Computer Graphics, 26, 265–272.
Wendt, G., Faul, F., Ekroll, V., & Mausfeld, R. (2010). Disparity, motion, and color information improve gloss constancy performance. Journal of Vision, 10 (9): 7, 1–17, https://doi.org/10.1167/10.9.7. [PubMed] [Article]
Wills, J., Agarwal, S., Kriegman, D., & Belongie, S. (2009). Toward a perceptual space for gloss. ACM Transactions on Graphics, 28 (4): 103:1–103:15.
Yeshurun, Y., Carrasco, M., & Maloney, L. T. (2008). Bias and sensitivity in two-interval forced choice procedures: Tests of the difference model. Vision Research, 48, 1837–1851.
Footnotes
1  Changes in surface shape are not considered further here, but their effect on perceived gloss has been discussed elsewhere (Ho, Landy, & Maloney, 2008; Marlow & Anderson, 2015, 2016; Marlow, Todorović, & Anderson, 2015).
Figure 1
 
(a) Simple reflectance functions, showing variation in specularity and micro-roughness. (b) The ill-posed problem of estimating reflectance: Image changes can be caused by changes in reflectance, illumination, and/or tone mapping. In each image one property (illumination, reflectance, or tone mapping) was modified while the other two were kept constant.
Figure 2
 
Eight example light-fields from the SYNS dataset. Some light-fields captured in full sun included an artifact below the sun (a bright, narrow vertical strip); this was removed by replacing the affected part of the image with the corresponding pixels from a second image captured a few minutes later.
Figure 3
 
Luminance distributions (left) and luminance as a function of elevation (right). The luminance scale is arbitrary. Rows represent light-fields from indoor (upper), outdoor cloudy (middle), and outdoor sunny scenes (lower). Each gray line corresponds to a single scene; black lines show the median, and green lines show the 10th and 90th percentiles. Note that our analyses used 2D images created via equirectangular projection—each pixel covers the same angular range in elevation (θ) and azimuth, rather than equal solid angles on the view sphere. To compensate for this, frequency was weighted by sin(θ), where θ = 0° is downward and θ = 90° is towards the horizon.
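The sin(θ)-weighting described above can be sketched in a few lines. This is an illustrative helper, not the authors' code; the function name, row orientation (rows spanning elevation from straight down, θ = 0°, to straight up, θ = 180°), and histogram binning are all assumptions.

```python
import numpy as np

def solid_angle_weighted_hist(env_map, bins=64):
    """Luminance histogram of an equirectangular environment map,
    with each pixel weighted in proportion to its solid angle.

    env_map: 2D luminance array; rows span the polar angle theta
    from straight down (0) to straight up (pi). Illustrative only;
    layout conventions are assumptions, not taken from the paper.
    """
    n_rows, _ = env_map.shape
    # Polar angle at each row centre: 0 = downward, pi/2 = horizon.
    theta = np.linspace(0.0, np.pi, n_rows, endpoint=False) + np.pi / (2 * n_rows)
    # Equal-angle rows over-represent the poles; weighting by
    # sin(theta) restores each pixel's true solid angle on the sphere.
    weights = np.broadcast_to(np.sin(theta)[:, None], env_map.shape)
    hist, edges = np.histogram(env_map.ravel(), bins=bins,
                               weights=weights.ravel())
    return hist, edges
```

Without the weighting, the many pixels near the zenith and nadir of an equirectangular image would dominate the luminance statistics even though they cover a vanishingly small patch of the sphere.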
Figure 4
 
Spectral power distributions: the mean squared spherical-harmonic coefficient at each order, as a function of order, for luminance (left column) and log luminance (right column).
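Averaging the squared coefficients within each order reduces a full set of spherical-harmonic coefficients to one power value per order, which is what this figure plots. A minimal sketch, assuming the coefficients for order l are already collected into an array of length 2l + 1 (real libraries such as pyshtools use different storage conventions):

```python
import numpy as np

def power_per_order(coeffs):
    """Mean squared spherical-harmonic coefficient at each order l.

    coeffs: sequence where coeffs[l] holds the 2l + 1 coefficients
    a_{lm}, m = -l..l. The layout is an assumption for illustration.
    """
    return np.array([np.mean(np.abs(a) ** 2) for a in coeffs])
```

Plotting the returned values against l on log axes gives the kind of power-versus-order curve shown in the figure.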
Figure 5
 
Above: an example light-field in its original form (standard), and after the three different illumination manipulations. Below: example stimuli, rendered under standard illumination, to illustrate the nine different stimulus gloss levels.
Figure 6
 
Stimuli rendered under standard, uniform, half-slope, and Halloween light-fields. On the left, moderately glossy stimuli (level 3) are presented with the background visible. On the right, high-gloss stimuli (level 7) are presented against a gray background, rather than the true (rendering) context. Below, the schematic shows the slow reveal used in Experiment 1.
Figure 7
 
Data from Experiment 1 for the background-absent (top row) and background-present (bottom row) conditions. First column: Example data for one naïve observer. Second column: Group data, averaged across all observers. Third column: Summary data: mean perceived gloss for each illumination manipulation, averaged across gloss levels. Fourth column: Sensitivity, defined as the difference in perceived gloss, in JND units, between the most glossy and most matte stimuli. Error bars: ±1 SE across observers. Curves are 2nd-order polynomial fits to the data.
Figure 8
 
Simple statistics characterizing the luminance distribution within the object's image (top row) or the background context (bottom row), averaged across light fields and random variations in object shape.
Figure 9
 
Experiment 2. (a) Tone-mapping function used to map high-dynamic-range values to a simulated low-dynamic-range display (blue line). The identity function (slope 1) is shown as a dashed black line for reference. (b) Perceived gloss of stimuli with or without tone mapping when presented without the background, and (c) with the background present.