Open Access
Article  |   March 2023
Texture statistics involved in specular highlight exclusion for object lightness perception
Author Affiliations
  • Hiroki Nohira
    Department of Information and Communications Engineering, Tokyo Institute of Technology, Nagatsuta-cho, Midori-ku, Yokohama, Japan
    nohira.h.aa@m.titech.ac.jp
  • Takehiro Nagai
    Department of Information and Communications Engineering, Tokyo Institute of Technology, Nagatsuta-cho, Midori-ku, Yokohama, Japan
    nagai.t.aa@m.titech.ac.jp
Journal of Vision March 2023, Vol. 23, 1. https://doi.org/10.1167/jov.23.3.1
Abstract

The human visual system estimates the physical properties of objects, such as their lightness. Previous studies on the lightness perception of glossy three-dimensional objects have suggested that specular highlights are detected and excluded in lightness perception. However, only a few studies have attempted to elucidate the mechanisms underlying this exclusion. This study aimed to elucidate the image features that contribute to the highlight exclusion of lightness perception. We used Portilla-Simoncelli texture statistics (PS statistics), an image feature set similar to the representation in the early visual cortex, to explore their relationships with highlight exclusion for lightness perception. In Experiment 1, computer graphics images of bumpy plastic plates with various physical parameters were used as stimuli, and the lightness perception on them was measured using a lightness matching task. We then calculated the highlight exclusion index, which represented the degree of highlight exclusion. Finally, we evaluated the correlation between the highlight exclusion index and the four PS statistic subsets. In Experiment 2, an image synthesis algorithm was used to create images in which one of the PS statistic subsets was manipulated. The highlight exclusion indexes of the synthesized images were then measured. The results revealed that the PS statistic subset consisting of the lowest order image features, such as moment statistics of luminance, acts as a necessary condition for highlight exclusion, whereas the other three subsets consisting of higher order features are not crucial. These results suggest that the low-order image features are the most important among the features in PS statistics for highlight exclusion, even though image features of higher order than those in PS statistics must also be directly involved.

Introduction
At first glance, human beings can judge various types of surface qualities of objects. Lightness is a typical perceptual feature of object surface quality related to diffuse reflectance (Arend & Spehar, 1993; Adelson, 2000; Todd, Norman, & Mingolla, 2004). Numerous studies have investigated lightness perception using simple two-dimensional images as experimental stimuli (Adelson, 2000; Anderson & Winawer, 2005; Kim, Gold, & Murray, 2018; Murray, 2020) and discovered some basic strategies for the lightness perception of simple visual stimuli. For instance, Anderson and Winawer (2005) proposed a new illusion in which identical luminance patterns seem to be white or black, depending on the surrounding luminance patterns. This illusion suggests that the decomposition of a scene into multiple layers may contribute to lightness perception. Kim et al. (2018) applied the psychophysical reverse correlation method to the argyle illusion (Anderson, 1993) and extracted the spatial regions in the stimuli that are important for the illusion in humans and in different computational models of lightness perception. The results showed that the regions extracted by each model differed from those extracted by humans. In other words, the mechanisms of lightness perception have not yet been fully understood. 
Furthermore, the mechanisms of lightness perception in realistic and three-dimensional (3D) scenes are more complicated. Scenes that humans see daily are much more complex than simple stimuli. For instance, the light that reaches the human eye after reflecting off an object, and hence the retinal image, depends not only on the object's surface properties, but also on the interactions between multiple factors, such as object reflectance, object shape, illumination, and scene geometry. Additionally, estimating object reflectance from retinal images alone is an ill-posed problem. Therefore, estimating the diffuse reflectance of object surfaces, as in lightness perception in real scenes, is a highly difficult task for the visual system because of the complicated physical factors that affect retinal images. Indeed, the perception of object surface colors is affected by specular reflection and the 3D shapes of objects (Xiao & Brainard, 2008; Schmid & Anderson, 2014; Honson, Huynh-Thu, Arnison, Monaghan, Isherwood, & Kim, 2020). 
The visual system is thought to solve this difficult problem by using image features in visual scenes as heuristics (Fleming, 2014). For instance, the visual system has been suggested to rely on high-luminance regions in object images, not simply on mean luminance, for lightness perception (Toscani, Valsecchi, & Gegenfurtner, 2013). Toscani et al. (2013) measured eye movements during a color-matching task on real matte objects and found that observers tended to look at bright regions on object surfaces. In this case, "high luminance" may serve as a heuristic for lightness perception. 
Specular highlights are also relevant for the lightness perception of 3D object images. For instance, luminance skewness on object images is negatively correlated with perceived lightness when the mean luminance is fixed (Motoyoshi, Nishida, Sharan, & Adelson, 2007; Sharan, Li, Motoyoshi, Nishida, & Adelson, 2008). In other words, the highlight luminance components represented in the right tail of the luminance histogram are not considered to predominantly contribute to lightness perception. Indeed, some studies have reported that highlight regions tend to be excluded in object lightness and color perception in 3D object images (Todd et al., 2004; Toscani, Valsecchi, & Gegenfurtner, 2017; Honson et al., 2020). Todd et al. (2004) conducted psychophysical experiments to compare the perceived lightness and highlight luminance. Their stimuli were patches on 3D objects whose diffuse and specular reflectance were independently manipulated. The patch luminance on the highlight regions was 150% to 300% higher than that of the test patches on non-highlight regions. Nevertheless, the increase in lightness perception from the nonhighlight regions to the highlight regions was approximately 5%. Toscani et al. (2017) explored the luminance percentile to estimate the diffuse reflectance of glossy objects by using a linear discriminant model. They found that informative luminance exists up to approximately the 80th percentile (corresponding with the nonhighlighted areas), beyond which the estimation performance degrades significantly. Furthermore, psychophysical experiments have shown that the perceived lightness of object images is significantly modulated by luminance modulation at the 60th to 70th percentiles, but not at the 90th to 100th percentiles. Similarly, Honson et al. (2020) measured object color perception in images of chromatic, bumpy, and glossy plates in a psychophysical experiment. 
On stimuli with ambiguous highlight regions, owing to the large micro-roughness of the object surface, perceived lightness was high and perceived saturation was low. By contrast, the perceived lightness and saturation were low and high, respectively, on images with clear highlights owing to the small roughness. These results suggest that the visual system detects highlights for lightness and color perception and then excludes their effects on perception. 
Image features and mechanisms involved in highlight exclusion for lightness perception have hardly been investigated. However, several studies have attempted to elucidate the mechanisms of glossiness perception, which is closely related to highlights (e.g., Motoyoshi et al., 2007; Anderson & Kim, 2009; Wiebel, Toscani, & Gegenfurtner, 2015; Prokott, Tamura, & Fleming, 2021; Storrs, Anderson, & Fleming, 2021). Some studies suggest the involvement of lower order image statistics in glossiness perception. For instance, Motoyoshi et al. (2007) reported that low-order image features, such as luminance skewness on object surfaces, correlated with glossiness perception under certain conditions. Additionally, Wiebel et al. (2015) investigated the correlation between glossiness perception and low-order image features in various natural images and the change in glossiness perception induced by modulations of the features. Their results showed that the standard deviation of luminance was more influential on perceived glossiness than the luminance skewness. Contrastingly, a series of studies (Anderson & Kim, 2009; Kim, Marlow, & Anderson, 2011; Marlow, Kim, & Anderson, 2011) showed that the lack of spatial congruence of highlights and shadows significantly decreased glossiness perception, regardless of the lower order luminance statistics. They claimed that photogeometric heuristics, such as the spatial congruence of highlights and shadings, not only lower order image features, are crucial for glossiness perception. Recently, artificial neural networks have been proposed as a tool to explore complicated image features contributing to glossiness perception (Nishida, 2019). For instance, Storrs et al. (2021) used an unsupervised neural network model, PixelVAE (Gulrajani et al., 2016; Zhao, Song, & Ermon, 2017), to investigate the relationship between human glossiness perception and information represented in unsupervised neural network models. 
They showed that the statistical structure in object images compressed into latent vector representations (possibly higher order image features) matched the human glossiness perception to a certain degree. However, this latent representation is insufficient to capture all properties of glossiness perception; it cannot capture the highlight–shading congruence but can capture three perceptual highlight features (sharpness, contrast, and coverage) supporting glossiness perception (Marlow, Kim, & Anderson, 2012). 
Although both lower and higher order image features may contribute to highlight exclusion, how can their contributions be separated? First, low-order image features in object images often simultaneously modulate high-order image features. For instance, glossiness change after adapting to luminance-skewed noise images (Motoyoshi et al., 2007) may be induced by perceptual modulation of highlight edges caused by adaptation rather than by direct effects of luminance statistics (Kim, Tan, & Chowdhury, 2016). Thus, it is difficult to determine whether the visual system relies on low-order image features or covarying high-order image features in experiments with modulations of simple image features. Contrastingly, artificial neural networks enable the capture of higher order image features, as described elsewhere in this article. For instance, Prokott et al. (2021) used various convolutional neural networks and showed that glossiness perception can be explained well by computational mechanisms that imitated the early- to mid-level visual processing of the human visual system. This result provides clues regarding the visual processing levels involved in glossiness perception. However, the abstract meaning of the image features represented in the intermediate layers of convolutional neural networks is difficult for humans to interpret. Thus, it remains challenging to elucidate the visual processing mechanisms involved in detail. 
Using Portilla-Simoncelli texture statistics (PS statistics; Portilla & Simoncelli, 2000) may be a possible approach to separating different levels of image features according to their impact on perception. PS statistics are a set of manually designed image features proposed for texture synthesis techniques. They are also used to analyze the properties of texture perception and peripheral vision (Freeman & Simoncelli, 2011; Okazawa, Tajima, & Komatsu, 2014) and comprise approximately 1,000 image features, although the number of features depends on the hyperparameters. They contain low-order image features, such as luminance contrast and skewness, and image features that digital filters can capture, such as spatial frequency and orientation, similar to image representations in simple and complex cells in V1 (Hubel & Wiesel, 1962). Furthermore, they also have more complex features, such as correlations between image features described elsewhere in this article, which are considered similar to those represented in V2 (Freeman, Ziemba, Heeger, Simoncelli, & Movshon, 2013; Ziemba, Freeman, Simoncelli, & Movshon, 2018; we refer to these complex features in PS statistics as quasi-low-order image features in this paper). Thus, using PS statistics allows us to quantitatively assess the impact on perception of image features one step higher in order than the lower order image features. Additionally, they offer interpretability regarding the meaning of the involved image features and the corresponding visual processing levels, such as V1 and V2. 
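To make the lowest order subset concrete, the marginal (moment) statistics of luminance included in PS statistics can be sketched as follows. This is a simplified illustration, not the full PS computation, which also includes minimum, maximum, and statistics computed at multiple scales and orientations of a steerable pyramid:

```python
import numpy as np

def marginal_statistics(img: np.ndarray) -> dict:
    """Lowest-order (marginal) luminance statistics of the kind included in
    PS statistics: mean, variance, skewness, and kurtosis of pixel luminance."""
    x = img.astype(float).ravel()
    mu = x.mean()
    var = x.var()
    sd = np.sqrt(var)
    skew = np.mean(((x - mu) / sd) ** 3)   # third standardized moment
    kurt = np.mean(((x - mu) / sd) ** 4)   # fourth standardized moment
    return {"mean": mu, "variance": var, "skewness": skew, "kurtosis": kurt}

# An exponential luminance distribution is positively skewed (skewness near 2),
# mimicking the long right tail that specular highlights add to a histogram.
rng = np.random.default_rng(0)
stats_ = marginal_statistics(rng.exponential(size=10_000))
```

A highlight-rich image would show a larger skewness than a matte image with the same mean, which is exactly the kind of covariation the experiments below try to disentangle.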
This study aimed to elucidate the image features based on which the visual system excludes highlights for lightness perception using PS statistics. In Experiment 1, lightness perception was psychophysically measured using a lightness matching task on computer graphics images of bumpy plates with various physical parameters. Then, we calculated a “highlight exclusion index” (HEI), which represents the degree of highlight exclusion for lightness perception from the results. Finally, we evaluated the correlation between the HEI and each of the four different PS statistic subsets to identify PS statistic subsets that could serve as cues for highlight exclusion. In Experiment 2, synthesized images based on PS statistics, in which some PS statistics were randomized, were used as the stimuli. As in Experiment 1, we measured their perceived lightness and calculated highlight exclusion indices. Then, the relationship between the highlight exclusion indices and randomized PS statistic subsets was examined. The results suggest that only the PS statistic subset consisting mainly of lower order image features, such as luminance moment statistics, functions as a necessary condition for highlight exclusion, whereas the other three PS statistic subsets consisting of quasi-low-order image features do not. 
Experiment 1: Correlation of PS statistics and lightness perception
Experiment 1 aimed to explore candidate PS statistic subsets that could serve as cues for highlight exclusion in lightness perception. In this experiment, we analyzed the correlation between the degree of highlight exclusion and each PS statistic subset. The PS statistic subsets with high correlations can be considered candidate cues for highlight exclusion. 
Methods
Observers
Six observers participated in Experiment 1. One was the author H.N., and the others were undergraduate or graduate students at Tokyo Institute of Technology. All the observers were males in their 20s. Visual acuities or corrected visual acuities were 0.4 or better, as assessed using the Landolt ring chart. In casual observation, it was confirmed that their visual acuities were sufficient to perceive the shapes of all stimulus object images, which were relevant to lightness perception. The Ethical Review Committee of Tokyo Institute of Technology approved the contents of Experiment 1 based on the Declaration of Helsinki. All observers were informed about the experiment in advance and provided written informed consent. 
Apparatus
Experiments were conducted in a darkroom. An LCD (KIPD4K156, KEIAN, Japan) was placed in a darkroom to present the stimuli. The screen resolution and size of the display were 2,160 (vertical) × 3,840 (horizontal) pixels and 19.4 × 34.5 cm at a refresh rate of 30 Hz. The gamma characteristics and spectral distributions of the RGB channels were carefully measured using a colorimeter (ColorCAL II, Cambridge Research Systems, UK) and spectroradiometer (Specbos 1211-2, JETI Technische Instrumente GmbH, Jena, Germany), respectively, to ensure accurate luminance and color presentation. The display was connected to a desktop computer (Pavilion Desktop 595, HP Japan Inc., Japan, CPU: Intel Core i5-9400 [2.90 GHz × 6 cores], memory: 8 GB, GPU: Radeon RX550). MATLAB R2020a (MathWorks, Natick, MA) and Psychtoolbox3 (Brainard, 1997; Kleiner, Brainard, & Pelli, 2007) on Ubuntu 18.04 LTS controlled the experimental procedure. 
The observer's head was approximately fixed by a chin rest during the experiment at a viewing distance of approximately 50 cm. Additionally, one eye was covered with a black eye shield supported by the observer's hand; thus, the observer viewed the stimulus through one eye. This eye shield was used to enhance 3D impressions of the stimuli by removing the binocular disparity from the flat display (Phuangsuwan, Ikeda, & Katemake, 2013). Casual observations before the experiment confirmed that this monocular observation indeed elicited a stronger 3D impression. Observers used a mouse to respond. 
Stimulus: Overview
A flowchart of stimulus creation is shown in Figure 1. All stimuli were computer graphics images rendered with Mitsuba 0.6 (Jakob, 2010). After generating the images, a tone mapping procedure was applied to fit the luminance into the display gamut, as described elsewhere in this article. Figure 2 shows an example of a stimulus displayed on a screen. The test stimulus was an image of a bumpy plate on the left, and the reference stimulus was an image of a matte sphere on the right. The sizes of the stimuli are shown in Figure 2. There was a 120-pixel black band at the top and bottom of the screen, with RGB pixel values of zero. 
Figure 1.
 
Flowchart of stimulus creation in Experiment 1.
Figure 2.
 
An example stimulus in a trial. The text and arrows were not presented during the experiment.
Stimulus: Rendering and physical parameters
The four environment maps shown in Figure 3 were used as light sources in the rendering. These were obtained from Poly Haven (https://polyhaven.com/). The notations in parentheses in the figure indicate the file names. These environment maps were arbitrarily selected by the authors to ensure diversity in terms of whether the lighting was indoor or outdoor, and whether the lighting was highly or poorly localized. 
Figure 3.
 
Environment maps. The string below each image indicates its file name.
The test stimuli were images of bumpy plates whose surface reflection properties and bumpy shapes were the experimental parameters. Because PS statistics were developed to represent textures, they assume the periodicity of the images. Therefore, the objects used as test stimuli were designed to be spatially periodic, considering their compatibility with PS statistics. 
We used two types of surface reflectance properties: diffuse (Lambertian reflection) and roughplastic (referred to as plastic hereafter, for simplicity), defined in Mitsuba. In both types, the diffuse reflectance was fixed at a low value of 0.1 to make the highlights clearly visible. For the plastic, the surface roughness was represented using a Beckmann distribution. The specular reflectance was set to 1, which is consistent with the physical constraint. Because the diffuse and specular reflectance were the same for all the stimuli, the differences in perceived lightness and glossiness, if any, must have been derived from the effects of other parameters. The roughness parameter α was 0.025, 0.050, 0.100, or 0.200. The other reflection properties were set to the default values in Mitsuba. 
The parameters of the depth shape were characterized by their spatial frequencies and amplitudes. The baseline depth shape was that of an actual plastic plate for a printer panel measured using a 3D shape-measuring instrument (VR-3000, KEYENCE, Osaka, Japan). Although the obtained depth image had 720 (height) × 970 (width) pixels, the central area of 360 × 465 pixels was cropped when used to define the object shapes. Mitsuba's heightfield function modulated the shape of the flat plate based on this cropped depth image to create a bumpy shape. 
The spatial frequency of the plate shape was controlled by the number of repetitions of the cropped depth images. When applied to the plate shape, the depth image was repeated in the vertical and horizontal directions using repeated mirror image inversion. The number of repetitions was 1, 2, 4, or 8. Because the plate size was the same across the number of repetitions, the number was proportional to the spatial frequency of the depth patterns. Thus, in this study, the number of repetitions was referred to as the frequency. We chose the plate size such that the shape size with a frequency of 8 was almost the same as that in the actual plate. The spatial frequency power of a stimulus with a frequency of 1 is shown in Figure S1.1 in Supplementary Materials. The power was mainly concentrated in the range of 0 to 5 cycles/degree at a frequency of 1. 
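The repetition scheme, in which the cropped depth image is tiled with alternating mirror inversions so that adjacent tiles join without seams, can be sketched as follows. This is an illustrative reconstruction; the actual stimuli were built by feeding the repeated depth image to Mitsuba's heightfield function:

```python
import numpy as np

def tile_mirrored(depth: np.ndarray, reps: int) -> np.ndarray:
    """Tile a depth image reps x reps times, mirror-flipping alternate tiles
    vertically and horizontally so edges match at every tile boundary."""
    rows = []
    for i in range(reps):
        row = []
        for j in range(reps):
            tile = depth
            if i % 2 == 1:
                tile = np.flipud(tile)  # mirror odd rows vertically
            if j % 2 == 1:
                tile = np.fliplr(tile)  # mirror odd columns horizontally
            row.append(tile)
        rows.append(np.concatenate(row, axis=1))
    return np.concatenate(rows, axis=0)

patch = np.arange(6).reshape(2, 3).astype(float)
tiled = tile_mirrored(patch, 2)  # frequency of 2: a 2 x 2 mirrored tiling
```

Because the plate size is fixed, doubling `reps` doubles the spatial frequency of the depth pattern, which is why the number of repetitions is referred to as the frequency.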
The amplitude of the depth shape was modulated by the parameter depth coefficient of 0, 1, 2, 4, or 8. The depth coefficient was not directly used to modulate the depth, but a coefficient determined from each depth coefficient and frequency was multiplied by the depth image. The multiplied coefficient ka for the depth coefficient kd and frequency f was calculated as ka = 8kd/f. For instance, the multiplied coefficients were 0, 2, 4, 8, and 16 for a frequency of 4. Note that a depth coefficient of 0 corresponds with a flat plate. 
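The mapping from depth coefficient and frequency to the multiplied coefficient, ka = 8kd/f, can be written directly:

```python
def multiplied_coefficient(depth_coefficient: float, frequency: float) -> float:
    """Coefficient applied to the depth image: k_a = 8 * k_d / f,
    as defined in the stimulus description above."""
    return 8.0 * depth_coefficient / frequency

# For a frequency of 4, depth coefficients 0, 1, 2, 4, 8 give 0, 2, 4, 8, 16.
coeffs = [multiplied_coefficient(kd, 4) for kd in (0, 1, 2, 4, 8)]
```

Dividing by the frequency keeps the depth amplitude proportional to the bump size, so a given depth coefficient produces a comparable relative bumpiness across frequencies.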
In summary, there were 16 test stimuli for diffuse plates with frequencies of 1, 2, 4, and 8 and depth coefficients of 1, 2, 4, and 8. There were 64 test stimuli of plastic plates with roughness values of 0.025, 0.050, 0.100, and 0.200, frequencies of 1, 2, 4, and 8, and depth coefficients of 1, 2, 4, and 8. There were also four plastic plates with roughness values of 0.025, 0.050, 0.100, and 0.200 and a depth coefficient of 0. Therefore, 84 different objects were used in this study. Each object was rendered under four lighting conditions, resulting in a total of 336 test stimuli. Some of the test stimuli with different parameters are shown in Figure S1.2. 
The reference stimulus was an image of a diffuse sphere whose reflectance could be adjusted by the observer. The diffuse reflectance ranged from 0.068 to 0.591 in 42 uniform steps on a logarithmic luminance scale to achieve an approximately perceptually uniform change in reflectance during adjustment. The reference stimuli were rendered using the same four environment maps as those of the test stimuli. Test and reference stimuli rendered in the same environment map were presented in pairs for each trial. Similar reference stimuli have often been used in previous studies employing asymmetric lightness matching (Toscani et al., 2017; Honson et al., 2020). 
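The 42 adjustment levels, uniform on a logarithmic scale, can be generated as follows (a sketch of the range described above):

```python
import numpy as np

# 42 reflectance levels from 0.068 to 0.591, equally spaced in log space,
# so each wheel click scales reflectance by a constant ratio.
levels = np.logspace(np.log10(0.068), np.log10(0.591), num=42)

# The ratio between adjacent levels is constant:
ratios = levels[1:] / levels[:-1]
```

Constant ratios rather than constant differences are what make the steps approximately perceptually uniform, since lightness perception is roughly logarithmic in luminance.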
In addition, a shelf with a checkerboard texture was placed behind the test and reference objects. The shelf comprised diffuse surfaces with diffuse reflectance of 0.2 and 0.4 and was not in contact with the test or reference objects. The shelf was used to make the environment maps invisible and to prevent the underside of the reference sphere from being unnaturally dark. 
Stimulus: Tone mapping
The luminance of the rendered images was adjusted using a tone mapping procedure for each environment map after rendering so that the images on the screen appeared natural. The resultant image of the spectral rendering was represented in CIE 1931 XYZ. The image was then made achromatic by setting the chromaticity at D65 (x, y) = (0.313, 0.329). Finally, luminance Y in each environment map was manipulated as follows: For the images rendered under the urban environment map, Y was multiplied by a constant k = 50. In contrast, for the images rendered under the other three environment maps, nonlinear tone mapping was applied to Y because there were pixels with extremely high luminance. The tone mapping operator is given by Equation (1):  
\begin{equation}f\left( Y \right) = a\left( {\frac{Y}{{1 + Y}}} \right)\left( {1 + \frac{Y}{b}} \right), \tag{1}\end{equation}
where Y and f(Y) represent the luminance before and after tone mapping, respectively, and a and b are the parameters used to control the tone mapping properties. The values of these parameters are listed in Table 1. The authors arbitrarily determined these parameter values based on casual observations to achieve the most natural appearance of the stimuli under each environment map. 
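A minimal implementation of the tone mapping operator in Equation (1) might look like this; the parameter values below are illustrative only, not the values listed in Table 1:

```python
def tone_map(Y: float, a: float, b: float) -> float:
    """Nonlinear tone mapping operator of Equation (1):
    f(Y) = a * (Y / (1 + Y)) * (1 + Y / b).
    For small Y the response is roughly linear with slope a (since Y/(1+Y) ~ Y);
    for large Y the slope falls to a/b, compressing extreme highlight pixels
    when b is large relative to a."""
    return a * (Y / (1.0 + Y)) * (1.0 + Y / b)

# Illustrative parameters only (not the Table 1 values):
mapped_zero = tone_map(0.0, a=100.0, b=50.0)
mapped_mid = tone_map(1.0, a=2.0, b=1.0)
```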
Table 1.
 
Parameters for luminance manipulations on rendered images.
Stimulus: Relationship with PS statistics
This study aimed to find PS statistic subsets relevant to highlight–exclusion performance. Before the experiment, we examined how well the PS statistics represented the material impressions of the test stimuli by creating synthesized images based on their PS statistics. The image synthesis procedure was the same as that used in Experiment 2, which is described later. Some synthesized images are shown in Figure 4: two stimulus images (Figures 4a and b) and their synthesized images (Figures 4c and d). The synthesized images seemed to have material impressions similar to those of the stimulus images, suggesting that PS statistics can represent the material impressions of our stimuli to some extent. Notably, synthesized images do not necessarily appear as objects of the same material, because PS statistics do not consider the physical sources of the luminance patterns at all. In particular, PS statistics cannot reproduce the material impressions of object images in which object contours are visible. We speculate that the PS statistics successfully represented the material impression of our stimuli because our stimuli were periodic. 
Figure 4.
 
(a, b) Two stimulus images in Experiment 1. (c, d) Synthesized images created from a white noise image based on PS statistics of (a) and (b), respectively.
Procedure
Before the experiment, the observer entered the darkroom and covered one eye with an eye shield. The experiment was initiated by pressing the mouse wheel. A stimulus such as that shown in Figure 2 was presented to the observer in each trial. The observer adjusted the diffuse reflectance of the reference stimulus by rotating the mouse wheel so that its perceived lightness matched that of the test stimulus. The adjustment time was not restricted. The observer was carefully instructed to adjust the lightness so that "the left and right objects appeared to be painted with the same paint." This instruction was similar to that used in a previous study (Toscani et al., 2017). This instruction prevented the observers from confusing brightness with lightness. When the observer was satisfied with the adjustment, they pressed the left mouse button to complete the trial. Subsequently, the test and reference stimuli disappeared, and a shelf-only background image was presented for 1 second. The subsequent trial was then started, and the next test and reference stimuli were presented. 
Each session comprised 1 minute of practice trials followed by 42 main trials. Practice trials were conducted for the first minute to help observers adapt to the stimulus environment and familiarize themselves with the task. The test stimuli in the practice trials were selected randomly from those used in the main trials. The results of the practice trials were excluded from the analysis. Stimuli in the same environment map were consistently used in all trials in a single session to maintain a stable adaptation state during the session. Each observer responded twice to each of the 336 test stimuli: once in the first eight sessions and once in the second eight sessions. One-half of the observers saw the stimuli with their left eye in odd-numbered sessions and with their right eye in even-numbered sessions; the other one-half used the opposite eye order. The environment maps were changed every two sessions. The order of the environment maps was randomized in the first and second halves of the sessions per observer. The order of presentations of the test stimuli was also randomized. Each observer completed 16 sessions, resulting in 672 trials (= 336 × 2), excluding the practice trials. All observers completed all sessions in 5 days. 
Results
Figure 5 shows the matched diffuse reflectance of the reference stimuli adjusted by the observers. This figure presents the results for the ‘gamrig’ environment map. Unfortunately, because we used different reference stimuli among the environment maps, it was impossible to compare the results across them directly. Therefore, the results for the different environment maps were analyzed separately in the following analysis. The panels correspond with the roughness, the horizontal axis indicates the frequency, and the chart color indicates the depth coefficient. The results for the other environment maps are shown in Supplementary Material S2 (Supplementary Figures S2.1–S2.3) in the same format. 
Figure 5.
 
Matched diffuse reflectance adjusted by the observers for ‘gamrig’ environment map. The panels correspond with the results of plastic images with different roughness and diffuse images. The error bars show the standard errors of mean (SEMs). The vertical axis is shown in the logarithm scale. The SEMs were also calculated for the logarithm scale of reflectance.
These charts show the complex relationships between the parameters and the matched reflectance. For instance, the matched reflectance for plastic plates seems to increase with roughness. In other words, the perceived lightness was lower for plates with lower roughness. The matched reflectance also appeared to be lower for plastic stimuli with low frequencies and large depth coefficients. Contrastingly, the results for the diffuse plates did not show strong effects of the depth coefficient and frequency, suggesting that some factors influencing lightness perception exist only in plastic plates, but not in diffuse plates. We statistically analyzed the main effects and interactions between stimulus parameters. We performed a three-way analysis of variance on the matched reflectance of the bumpy plastic plates under each environment map, in which the factors were frequency, depth coefficient, and roughness. The results are presented in Table 2. Statistical significance was found for the main effect of roughness under all environment maps. This main effect is consistent with the results of a previous study on the lightness and saturation perception of glossy plates (Honson et al., 2020). On high roughness surfaces, highlight edges become blurred, and the visual system may no longer detect and exclude highlights correctly. Consequently, specular reflections may have been misinterpreted as diffuse reflections, leading to greater perceived lightness. Additionally, significant main effects of the depth coefficient and interactions between the depth coefficient and roughness were found in some environment maps. These trends are not necessarily consistent with the results of Honson et al. (2020). In any case, because both the depth coefficients and roughness affect specular highlights, their significant effects raise the possibility that specular highlights are related to lightness perception. 
Table 2.
 
Results of three-way analysis of variance for each environment map. Only the conditions with statistical significance are shown.
In the following subsections, we focus on the image features correlated with variations in the matched reflectance among the physical parameters. 
Relationship with mean luminance
One candidate simple image feature contributing to lightness perception is the mean luminance of the image. Observers likely judged lightness based on the mean luminance of the object images with spatial luminance variations. Figures 6a and b show the matched reflectance and mean luminance of the bumpy plastic stimuli, respectively, as functions of roughness in each environment map. The mean luminance was calculated on a logarithmic scale, and the resultant values were averaged across frequencies and depth coefficients. Both the matched reflectance and mean luminance show similar monotonically increasing trends. 
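Averaging on a logarithmic scale amounts to taking a geometric mean of luminance. A minimal Python sketch (the luminance map here is a hypothetical 2 × 2 example, not data from the experiment):

```python
import numpy as np

def log_mean_luminance(luminance):
    """Mean luminance computed on a log10 scale (the geometric mean),
    returned in cd/m^2."""
    return 10.0 ** np.mean(np.log10(luminance))

# toy luminance map in cd/m^2
lum = np.array([[1.0, 10.0],
                [10.0, 100.0]])
```

For this map the arithmetic mean is 30.25 cd/m², whereas the log-scale mean is 10 cd/m²; the log-scale mean weights rare bright (highlight-like) pixels less heavily.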
Figure 6.
 
(a) Matched reflectance and (b) mean luminance of bumpy plastic stimuli as a function of roughness. The values were averaged across depth coefficients and frequencies. The error bars show standard errors of mean across the frequencies and depth coefficients.
The direct relationship between the mean luminance and matched reflectance for bumpy plastic stimuli is shown in Figure 7. The correlation coefficients were calculated for all stimuli except the diffuse plates and the plastic plates with a depth coefficient of 0 in each environment map. There were high correlations between the mean luminance and matched reflectance, with correlation coefficients greater than 0.8, except for the 'entrance' environment map. This finding supports the possibility that mean luminance serves as a cue for lightness judgment. However, mean luminance cannot fully explain the matched reflectance. For instance, in the 'photo' environment map, two plots lie much lower in matched reflectance than the general trend, and several plots are vertically aligned near 7.0 cd/m2. These observations raise the possibility that image factors other than mean luminance also affect perceived lightness. 
Figure 7.
 
Relationship between the mean luminance of plastic stimuli and the matched reflectance for every environment map. Red plots indicate the standard stimuli (flat plastic plates), and blue plots indicate the other test stimuli. Error bars show standard errors of the mean. The black line is a linear regression line for the four red plots. Both the vertical and horizontal axes are shown on logarithmic scales.
Highlight exclusion index
Specular highlights are a candidate image feature, other than the mean luminance, that affected the matched lightness. In Figure 7, several plots show matched reflectance much lower than the general trend. This deviation may have been caused by a strategy in which specular highlight regions were disregarded when computing the mean image luminance for lightness perception. The results of some previous studies support this idea. For instance, Motoyoshi et al. (2007) investigated the relationship between low-order image features and surface quality perceptions such as lightness and glossiness. They showed that, with the mean luminance held constant, lightness perception decreased but glossiness perception increased as luminance skewness increased. These opposing trends between glossiness and lightness perception support the idea that perceptual lightness decreases when specular highlights are detected. 
Here, we attempt to quantify the degree of highlight exclusion for lightness perception from the current experimental results, based on the assumption that highlights are disregarded in lightness perception. The amount of reflected light was the same across our stimuli because the specular and diffuse reflectances were fixed. However, the perceived lightness of the glossy surfaces should be lower than a simple prediction based on mean luminance, owing to the exclusion of luminance in the highlight regions. Therefore, to quantify the highlight effects, we regarded the matched reflectance predicted by the mean luminance for stimuli without specular highlights as the standard lightness. Specifically, we used the results for stimuli with a depth coefficient of zero (hereafter referred to as standard stimuli) to define the standard lightness because they contained no local specular highlights on the plate surfaces. For the other test stimuli, the difference between the matched reflectance and the standard lightness was taken as an index of highlight exclusion. 
Four standard stimuli with different roughness values and mean luminances were used in each environment map. In Figure 7, the red and blue plots show the standard stimuli and the other stimuli, respectively. Linear regression was performed on the matched reflectance of the standard stimuli as a function of mean luminance to estimate the standard lightness over a wide range of mean luminance. The regression lines are shown as black lines in Figure 7, and the regression and determination coefficients are listed in Table 3. The slope depended on the illumination map because it reflected the relationship between the matched reflectance and stimulus luminance (e.g., stimulus luminance was strongly influenced by illuminant intensity even when reflectance was fixed). First, the slopes of the regression lines were positive; that is, the matched reflectance of the standard stimuli increased with mean luminance, as expected. Additionally, the regression lines captured the trend for the standard stimuli well, as indicated by the high determination coefficients. Furthermore, most of the blue plots lie below the regression line, in line with the idea that ignoring highlight components in the lightness estimate decreases the matched reflectance. Therefore, we defined the difference between the standard lightness represented by the regression line and the matched reflectance (i.e., the vertical distance between a blue plot and the regression line in Figure 7) as the highlight exclusion index (HEI). Note that we did not use the results for the diffuse test stimuli to obtain the standard lightness because their mean luminance was much lower than that of the plastic test stimuli. 
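The index computation described above can be sketched as follows (a simplified illustration with hypothetical arrays; the actual analysis used the observers' averaged settings per stimulus):

```python
import numpy as np

def highlight_exclusion_index(std_lum, std_refl, test_lum, test_refl):
    """HEI: vertical distance (in log10 matched reflectance) of each test
    stimulus below the regression line fitted to the standard stimuli."""
    # regress log10 matched reflectance on log10 mean luminance (standards)
    slope, intercept = np.polyfit(np.log10(std_lum), np.log10(std_refl), 1)
    standard_lightness = slope * np.log10(test_lum) + intercept
    return standard_lightness - np.log10(test_refl)
```

A positive value means the test stimulus was matched to a lower reflectance than the standard-lightness prediction, consistent with highlight exclusion.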
Table 3.
 
Linear regression formula and determination coefficient for each environment map. x represents the mean of the common logarithm of luminance (cd/m2), and the objective variable is the common logarithm of the matched reflectance.
The individual observers' data were merged to calculate the HEIs as follows. First, the matched reflectance was averaged across observers for each stimulus; HEIs were then calculated from the averaged matched reflectance. To evaluate 95% confidence intervals, we resampled the response data across repetitions and observers, so that both intra- and interobserver variations were reflected in the confidence intervals and statistical tests. 
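The resampling scheme can be illustrated with a generic percentile bootstrap (a sketch only; the actual analysis resampled jointly across repetitions and observers):

```python
import numpy as np

def bootstrap_ci(samples, stat=np.mean, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for `stat` of `samples`.
    `stat` must accept an `axis` argument (e.g., np.mean, np.median)."""
    samples = np.asarray(samples)
    rng = np.random.default_rng(seed)
    # resample observations with replacement, n_boot times
    idx = rng.integers(0, len(samples), size=(n_boot, len(samples)))
    boots = stat(samples[idx], axis=1)
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2])
```

The interval endpoints are the empirical 2.5% and 97.5% quantiles of the resampled statistic.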
The validity of the HEI should be verified before its use. Honson et al. (2020) reported that the effect of highlight exclusion on perceived lightness decreased as the roughness of a bumpy stimulus plate increased. If the HEI correctly reflects the impact of highlight exclusion, a similar trend should be observed in the HEI. The relationship between roughness and the HEI is shown in Figure 8. The index decreased with increasing roughness, similar to the results of Honson et al. (2020). A bootstrap test with 10,000 repetitions revealed that the slope of the linear regression line for the four data points was significantly negative (p < 0.05). This finding can be interpreted to mean that lower roughness (i.e., a smoother surface) made the specular highlights perceptually stronger and increased the HEI. Additionally, the HEIs seem relevant to perceived glossiness in terms of image statistics. Figures 9a and b show examples of low and high HEI stimuli, respectively. The high HEI stimulus has clear specular highlights and thus appears glossy. Because glossiness perception correlates with simple image features such as luminance skewness (e.g., Motoyoshi et al., 2007), the HEI may exhibit similar relationships with simple image features. To verify this, we calculated the correlation coefficients between the HEI and image statistics known to correlate with perceived glossiness (luminance contrast and luminance skewness) (Motoyoshi et al., 2007; Wiebel et al., 2015). The coefficients were 0.66 and 0.26 for contrast and skewness, respectively, indicating that the lower order image statistics relevant to glossiness perception may also be relevant to the HEI. 
Figure 8.
 
Relationship between roughness and the highlight exclusion index. The error bars indicate standard errors of the mean.
Figure 9.
 
Examples of (a) low and (b) high HEI stimuli (HEI was 0.00102 and 0.0946, respectively).
Based on these results, HEI is considered to reflect the effects of highlights on lightness perception, at least to some extent. In the following subsection, we examine the relationship between HEI and PS statistics. 
Relationship between HEI and PS statistics
We applied the contrast sensitivity properties of the human visual system to the luminance of the test stimulus as preprocessing before calculating the PS statistics. The visual system exhibits a band-pass spatial frequency contrast sensitivity function (CSF) for luminance (Kelly, 1983; Mullen, 1985). Some properties of the CSF derive from the cone distribution on the retina and from ocular optics (Cottaris, Jiang, Ding, Wandell, & Brainard, 2019); thus, the visual information reaching V1 is partially shaped by contrast sensitivity. To simulate the visual information sent to V1, we applied a digital filter whose spatial frequency response approximated the CSF to the test stimulus luminance. We used a MATLAB program (https://jp.mathworks.com/matlabcentral/fileexchange/68784-contrast-sensitivity-function-barten-s-model) implementing the CSF model proposed by Barten (2003). The central region of the filtered image was cropped to 640 pixels in height and width. 
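As an illustration of this preprocessing, the snippet below weights an image's frequency spectrum by a band-pass CSF. Note that it uses the Mannos-Sakrison CSF approximation as a stand-in for the Barten (2003) model implemented by the MATLAB program, so the exact curve differs from the one used in the study:

```python
import numpy as np

def csf_mannos(f):
    """Mannos-Sakrison band-pass CSF approximation (f in cycles/degree)."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

def apply_csf(image, pix_per_deg):
    """Filter a square luminance image by weighting each spatial-frequency
    component with the CSF."""
    n = image.shape[0]
    freqs = np.fft.fftfreq(n, d=1.0 / pix_per_deg)   # cycles per degree
    fx, fy = np.meshgrid(freqs, freqs)
    radial = np.hypot(fx, fy)                        # radial frequency
    filtered = np.fft.ifft2(np.fft.fft2(image) * csf_mannos(radial))
    return np.real(filtered)
```

Because the CSF is applied as a multiplicative gain in the Fourier domain, a uniform image is simply scaled by the DC gain of the filter.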
Then, PS statistics were calculated from the filtered images using the MATLAB program (http://www.cns.nyu.edu/∼lcv/texture/). The parameters were a scale of 4, an orientation of 4, and 9 neighboring pixels, resulting in 1,064 image features. We categorized the PS statistics into four subsets. According to a previous study (Okazawa et al., 2014), there are seven classes of PS statistics. 
  • Marginal, consisting of the mean, standard deviation, skewness, and kurtosis of luminance per scale, and so on;
  • Linear cross-position, consisting of correlations between the spatial positions of linearly filtered images;
  • Energy cross-position, consisting of correlations between the spatial positions of energy-filtered images;
  • Energy cross-scale, consisting of correlations between scales of energy-filtered images;
  • Energy cross-orientation, consisting of correlations between the orientations of energy-filtered images;
  • Spectral, consisting of the amplitudes of particular spatial frequency/orientation sub-bands; and
  • Linear cross-scale, consisting of the correlations between the linear filter image and its coarse image, whose phase angle was doubled.
However, four of these subsets (energy cross-position, energy cross-scale, energy cross-orientation, and spectral) cannot be synthesized separately in the MATLAB program used for image synthesis from PS statistics (http://www.cns.nyu.edu/∼lcv/texture/). Thus, we grouped them into a single subset called energy stats. 
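For reference, the lowest-order members of the marginal class are simple pixel-moment statistics, which can be computed directly (a sketch; the actual marginal subset also includes the per-scale statistics from the steerable pyramid):

```python
import numpy as np

def marginal_moments(luminance):
    """Pixel-based moment statistics of the kind in the marginal subset."""
    x = np.asarray(luminance, dtype=float).ravel()
    mu = x.mean()
    sigma = x.std()
    z = (x - mu) / sigma                 # standardized pixel values
    return {"mean": mu,
            "std": sigma,
            "skewness": np.mean(z ** 3),
            "kurtosis": np.mean(z ** 4)}  # Pearson kurtosis; 3 for Gaussian
```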
Here, we aimed to quantify the correlation between the HEI and each PS statistic subset. Partial least squares (PLS) regression was performed with the HEI as the objective variable and each PS statistic subset as the explanatory variables. All variables were standardized before regression. PLS regression linearly transforms the explanatory variables into n orthogonal latent variables chosen to explain as much of the variance of the objective variable as possible. This process combines collinear explanatory variables into common latent variables. Additionally, because the latent variables are orthogonal to each other, collinearity among them is not a problem in the regression. Thus, we could obtain reasonable regression results even though the number of explanatory variables greatly exceeded the number of stimuli and the explanatory variables were collinear. In PLS regression, the number of latent variables n is a hyperparameter; we searched for the n that suppressed overfitting using 16-fold cross-validation. The determination coefficient of the objective variable at each n was calculated by averaging the results of the 16 folds, and the cross-validation procedure was performed once. The maximum determination coefficient within the range n = 1 to 17 was used as the correlation index. In addition, a PLS regression using all PS statistics as explanatory variables was performed for comparison. 
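The core of PLS can be sketched with a minimal single-response (PLS1) NIPALS implementation. This is an illustrative reimplementation, not the code used in the study, and it assumes X and y are already centered/standardized:

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Minimal PLS1 (NIPALS): returns coefficients B with y_hat = X @ B.
    Latent variables are built greedily from the covariance of X and y."""
    X, y = X.copy().astype(float), y.copy().astype(float)
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)        # weight vector for this component
        t = X @ w                     # latent score (orthogonal across steps)
        tt = t @ t
        p = X.T @ t / tt              # X loading
        c = (y @ t) / tt              # y loading
        X -= np.outer(t, p)           # deflate X
        y -= c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)
```

With as many components as predictors, the fit coincides with ordinary least squares; with fewer, it regularizes by projecting onto the leading covariance directions, which is what makes it usable when explanatory variables are many and collinear.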
Figure 10 shows the PLS regression results. Figure 10a shows the determination coefficient R2 as a function of n for all PS statistic subsets. The determination coefficient peaked at small n, except for the linear cross-scale subset. The maximum determination coefficient for each PS statistic subset is shown in Figure 10b; the values were 0.5 to 0.6 for all subsets. These comparable determination coefficients indicate that every PS statistic subset correlates substantially with the HEI. However, we cannot conclude that all of them provide cues for highlight exclusion by the human visual system, because our results show only correlational relationships. 
Figure 10.
 
Results of PLS regression. (a) Determination coefficient as a function of the number of PLS components n. (b) Maximum determination coefficient across ns.
Discussion
In Experiment 1, lightness perception was measured using computer graphics object images. The matched reflectance (perceived lightness) was explained largely by the mean luminance of the stimulus. However, the residual components of perceived lightness that could not be explained by mean luminance were suspected to be induced by highlight exclusion. Therefore, we quantified the magnitude of highlight exclusion in lightness perception as the deviation of the matched reflectance from the prediction based on mean luminance and then calculated its correlation with each PS statistic subset. The correlations were moderately high for all PS statistic subsets, suggesting that all of them are candidate cues for highlight exclusion. 
However, it should be noted that Experiment 1 did not provide evidence that all PS statistic subsets were cues for highlight exclusion. First, the fact that regression based on the marginal subset showed reasonable accuracy is consistent with previous studies demonstrating the impact of low-order image features on lightness perception (Sharan et al., 2008). In contrast, the correlations for the other PS statistic subsets are difficult to interpret. For instance, the correlation between the other three subsets and the HEI may stem from covariance among the subsets if the marginal subset contributes to lightness perception. 
Minor concerns exist regarding the definition of the HEI. We used a flat plastic plate with a depth coefficient of zero as the standard to calculate the HEI; there were two concerns regarding this choice. The first is that the HEI may include the effects of shape-dependent modulations of perceived lightness. The perceived lightness of objects made of the same material depends on their shapes (Schmid & Anderson, 2014). Thus, the HEI based on flat plates may have reflected, at least partly, the change in perceived lightness due to the shape difference. Comparing lightness perception between flat and bumpy diffuse plates with equal mean luminance might compensate for this possible artifact; however, the mean luminance of the bumpy diffuse plates was much lower than that of the plastic plates in Experiment 1, so such a comparison was impossible. The second concern is the presence of specular reflections on the standard stimuli. Even though the luminance patterns on the standard stimuli (flat plastic plates) were almost uniform, the possibility that the visual system excluded specular reflection components from perceived lightness cannot be ruled out. Therefore, adopting a flat diffuse plate without specular reflection as the standard may be more appropriate. 
Experiment 2: Effects of PS statistics manipulation on lightness perception
Although Experiment 1 showed that all PS statistic subsets are candidate cues for highlight exclusion, it remains unclear whether the visual system actually relies on them. Therefore, Experiment 2 aimed to experimentally identify the PS statistic subsets that act as cues for highlight exclusion through image manipulation. The following steps were performed. 
  • 1. PS statistics were extracted from the test stimuli in Experiment 1.
  • 2. Images that retained some PS statistic subsets while randomizing others were synthesized.
  • 3. We measured the HEIs for the generated images using the same procedure as that in Experiment 1.
In this experiment, if the subset randomized during image synthesis is irrelevant to highlight exclusion, the HEI for the synthesized image should be retained. In contrast, if the randomized subset contains image features crucial for highlight exclusion, the HEI for the image should also be randomized. 
Here, we specifically focus on the PS statistic subsets necessary for highlight exclusion. Low-order image features sometimes correlate with surface quality perception (e.g., Motoyoshi et al., 2007; Sharan et al., 2008; Wiebel et al., 2015). However, later studies pointed out that low-order image features are insufficient to explain the properties of surface quality perception (Anderson & Kim, 2009; Marlow et al., 2011); for instance, these studies demonstrated a significant impact of highlight-shading spatial congruence on perceived glossiness. Image features that capture such spatial congruence have not yet been identified, but PS statistics could be involved because they contain features of higher order than simple pixel-based moment statistics. 
Methods
Apparatus
The same apparatus as in Experiment 1 was used. 
Observers
Six undergraduate and graduate students from Tokyo Institute of Technology participated in Experiment 2, one of whom was the author H.N. All participants were males in their 20s and had a visual acuity or corrected visual acuity of 0.4 or better, as assessed using the Landolt ring chart. Four of the six observers also participated in Experiment 1. This study was approved by the Ethical Review Committee of Tokyo Institute of Technology in accordance with the Declaration of Helsinki. The observers were informed about the experiment in advance and written informed consent was obtained. 
Stimulus: Overview
The spatial layout of the stimuli was nearly identical to that of Experiment 1 (Figure 1). Because the test stimuli in Experiment 2 were created through image synthesis, the use of multiple environment maps was less important. Thus, we adopted only the environment map “photo,” which tended to create naturalistic highlights on object surfaces. 
Stimulus: Reference stimulus
The reference stimuli were the same as those used with the 'photo' environment map in Experiment 1, except that the range of their diffuse reflectance was wider: from 0.042 to 0.59 in 51 steps, uniform on a logarithmic scale. This wider range made it possible to measure the lightness of stimuli with low mean luminance. 
Stimulus: Overview of image synthesis for test stimulus
The image synthesis procedures used to create the test images are summarized in Figure 11. Test stimulus images were created using a MATLAB program for image synthesis based on the PS statistics (http://www.cns.nyu.edu/∼lcv/texture/). The steps of the image synthesis were as follows. 
Figure 11.
 
Stimulus creation procedure in Experiment 2. In this example, the target image was selected from the high HEI stimuli in Experiment 1, and the marginal statistic subset was randomized. This example also shows how to create one-subset-randomized stimuli.
  • Target image selection
We selected 20 images, from which the target PS statistics for image synthesis were obtained, based on their HEIs measured in Experiment 1, as follows. HEIs were obtained for 256 bumpy plastic plates in Experiment 1. These stimulus images were divided into two groups: one containing the 64 images with the highest 25% of HEIs and the other containing the 64 images with the lowest 25%. Then, pairs were formed, each consisting of one image selected from each group. Ten pairs were randomly created under the following constraints. 
  • Two pairs different only in environment map
  • Two pairs different only in frequency
  • Two pairs different only in depth coefficient
  • Two pairs different only in roughness
  • Two randomly selected pairs
  • The 20 stimulus images in these 10 pairs were all different from each other.
These constraints were adopted to examine the effect of changes in various physical features on PS statistics. The parameters of the ten image pairs are listed in Table 4. These images are referred to as target images
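The grouping step can be sketched as follows (the HEI values here are hypothetical random numbers; the actual selection additionally enforced the parameter constraints listed above):

```python
import numpy as np

def quartile_groups(hei, frac=0.25):
    """Indices of the highest- and lowest-`frac` stimuli by HEI."""
    order = np.argsort(hei)
    k = int(len(hei) * frac)
    return order[-k:], order[:k]   # (highest group, lowest group)

rng = np.random.default_rng(0)
hei = rng.random(256)                      # stand-in for 256 measured HEIs
high_group, low_group = quartile_groups(hei)
```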
Table 4.
 
Stimulus parameters of target images.
  • PS statistics calculation
The PS statistics were calculated for the central 640 × 640 pixels of each target image, as in Experiment 1. The hyperparameters for the PS statistics analysis and image synthesis were the same as in Experiment 1: a scale of 4, an orientation of 4, and 9 neighboring pixels. In addition, the PS statistics calculation and image synthesis were performed on the common logarithm of luminance rather than on linear luminance, because image synthesis on linear luminance could produce negative luminance values. 
  • Image synthesis
Images were synthesized from the initial images toward the target PS statistics. The synthesis algorithm asymptotically creates an image with the target PS statistics by iteratively modulating the initial image. In the image synthesis options, only some of the four PS statistic subsets were specified for synthesis, whereas the others were not; the initial images and the specified statistic subsets depended on the conditions and are described in the following subsections. The number of iterations was set to 50 because a previous study reported that this algorithm converges well within 50 iterations in most cases (Portilla & Simoncelli, 2000). The synthesized images were 640 × 640 pixels (7.8° × 7.8°) in size. The images synthesized toward high and low HEI targets are referred to as high and low HEI synthesized images, respectively. 
  • Tone mapping
Finally, a tone mapping operator was applied to the luminance of the synthesized images because some exceeded the color gamut of the display. The tone mapping operator is shown in Equations (2) and (3).  
\begin{eqnarray} && y^{\prime} = {\log _{10}}70 - \left( {{{\log }_{10}}70 - {{\log }_{10}}60} \right)\nonumber \\ && \qquad\times\,\exp \left( { - \frac{{{{\log }_{10}}Y - {{\log }_{10}}60}}{{{{\log }_{10}}70 - {{\log }_{10}}60}}} \right),\quad \end{eqnarray}
(2)
 
\begin{equation}Y^{\prime} = \max \left( {{{10}^{y^{\prime}}},{\rm{\ }}0.5} \right),\end{equation}
(3)
where Y in Equation (2) is the luminance before tone mapping and Y' in Equation (3) is the tone-mapped luminance. With this operator, luminance below 0.5 cd/m2 was rounded up to 0.5 cd/m2, and luminance above 60 cd/m2 was exponentially compressed so that the maximum value became 70 cd/m2, as shown in Figure 12. Low- and high-luminance tone mapping was applied to 4 and 15 images, respectively. We also confirmed that the effects of tone mapping on the PS statistics were negligible. 
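A Python sketch of this operator. The smooth exponential shoulder above 60 cd/m² asymptoting at 70 cd/m² is an assumption consistent with the description; luminance at or below 60 cd/m² is assumed to pass through unchanged before the 0.5 cd/m² floor is applied:

```python
import numpy as np

def tone_map(Y, lo=0.5, knee=60.0, ceiling=70.0):
    """Clamp luminance below `lo`; compress luminance above `knee`
    exponentially so it asymptotes at `ceiling` (all in cd/m^2)."""
    a, b = np.log10(ceiling), np.log10(knee)
    x = np.log10(np.asarray(Y, dtype=float))
    # exponential shoulder: continuous and slope-matched at the knee
    shoulder = a - (a - b) * np.exp(-(x - b) / (a - b))
    y = np.where(x > b, shoulder, x)
    return np.maximum(10.0 ** y, lo)
```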
Figure 12.
 
Tone mapping operator represented by Equations (2) and (3). Luminance on the horizontal and vertical axes is shown on a logarithmic scale.
Stimulus: One-subset-randomized image
There were two types of synthesized test stimuli with different randomized subsets: one-subset-randomized and one-subset-target images. One-subset-randomized images were used to explore PS statistic subsets that do not act as a necessary condition for highlight exclusion. The initial image for image synthesis was a white noise image whose mean and variance of luminance matched those of the synthesis target. Only three of the four PS statistic subsets were specified for synthesis, and the remaining one was not. For example, a marginal-randomized image was created by not specifying the marginal subset for synthesis. Eighty synthesized images were created by combining the four randomized subsets with the 20 target images. Some of the one-subset-randomized images are shown in Figure 13. The differences in material impressions among the four randomization conditions in Figure 13 seem subtle. However, because the stimulus size in the experiment was much larger (approximately 7.8° in height and width), the differences in image impressions were perceived more clearly by the observers. For instance, specular-like impressions tended to be weaker for the marginal-randomized stimuli. 
Figure 13.
 
Examples of one-subset-randomized images. The rows show the high- and low-HEI synthesized images, and the columns show the randomized PS statistic subsets. It should be noted that the stimulus size was much larger in the experiment (7.8 × 7.8 degrees), leading to surface quality impressions somewhat different from the images shown here.
The correlation between the PS statistics of the target and synthesized images was examined to evaluate the convergence of image synthesis. Under all conditions, the PS statistic subsets specified for synthesis exhibited correlation coefficients close to 1, confirming sufficient convergence. In contrast, the correlation coefficients for the unspecified (randomized) subset ranged from 0.50 to 0.95, confirming partial randomization of that subset. These correlations are presented in Supplementary Material S3. 
Stimulus: One-subset-target image
One-subset-target synthesized images were used to confirm that the marginal subset acts as a necessary condition for highlight exclusion; in other words, we examined whether the marginal subset alone could modulate highlight exclusion performance. In this image synthesis, only the marginal subset was specified for synthesis, whereas the other three subsets were randomized. The initial images were computer-graphics images of bumpy plastic plates from Experiment 1 rather than white noise images, because most images synthesized from white noise did not appear to be object images when only the marginal subset was specified. The initial images were randomly selected from the stimulus images in Experiment 1: one from the images with the highest 25% of HEIs and one from the lowest 25%, both different from the 20 target images in Table 4. The two initial images are shown in Figure 14. Through image synthesis, 40 one-subset-target images were created (2 initial images × 20 target images). Some of the synthesized images are shown in Figure 15. To evaluate the convergence of image synthesis, the correlation coefficients of the PS statistics between the target and synthesized images were examined. The correlation coefficients for the marginal subset ranged from 0.75 to 0.96, indicating approximate convergence; those for the other subsets ranged from 0.11 to 0.68, indicating moderate randomization. These correlations are presented in Supplementary Material S3. 
Figure 14.
 
Initial images used to synthesize one-subset-target images. The left and right images were randomly selected from the lowest and highest 25% of the HEI in Experiment 1, respectively.
Figure 15.
 
Examples of one-subset-target synthesized images. The rows correspond to the high- and low-HEI synthesized images. The columns show the initial computer-graphics images with the high and low HEIs.
Some of the one-subset-randomized and one-subset-target images appeared to have nonuniform diffuse reflectance, as seen in some images in Figures 13 and 15. Because all images contained luminance patterns, the bright regions may have been perceived either as high-reflectance regions or as specular highlights. The former perception, which occurs when the bright regions do not look like specular highlights, is considered to produce the impression of a surface with nonuniform diffuse reflectance. This appearance was taken into account in the instructions given to the observers, as described elsewhere in this article. 
Stimulus: New standard stimuli for HEI calculation
Additional images were created as standard images to calculate the HEIs. These additional images were four flat and four bumpy plates with Lambertian reflections (eight images in total), resulting in images of diffuse plates without specular reflections. The mean luminance of each of the flat and bumpy plate images was evenly distributed in four steps on a logarithmic scale from 3.16 to 12.59 cd⁄m2. Images were created as follows. 
  • A flat and a bumpy diffuse plate were rendered with a diffuse reflectance of 0.1 under the photo environment map. The shape of the bumpy plate was set to a frequency of 1 and a depth coefficient of 8. These two renderings served as the base images.
  • Both base images were trimmed to the same size as the synthesized images.
  • The luminance of each trimmed base image was multiplied by a coefficient to match the desired mean luminance.
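The final scaling step can be sketched as below. This is a minimal illustration under the assumption that a luminance image is stored as a NumPy array; the random base image is a hypothetical placeholder for a rendered diffuse plate.

```python
import numpy as np

def scale_to_mean_luminance(image, target_mean):
    """Multiply a luminance image by a single coefficient so that its
    mean luminance equals target_mean (in cd/m^2)."""
    coeff = target_mean / image.mean()
    return image * coeff

# Four target means evenly spaced on a log scale from 3.16 to 12.59 cd/m^2
targets = np.logspace(np.log10(3.16), np.log10(12.59), 4)

# Placeholder for a trimmed base image of a rendered diffuse plate
base = np.random.default_rng(1).uniform(0.0, 1.0, size=(64, 64))
scaled = [scale_to_mean_luminance(base, t) for t in targets]
```

Because a single multiplicative coefficient is used, the spatial luminance pattern (and hence the shading structure) of the base image is preserved; only its overall level changes.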
We created these new standard stimuli to increase the validity of the HEIs. The diffuse plates should reduce the effects of residual specular highlights differently from the flat plastic plates used as standard stimuli in Experiment 1. Additionally, we checked the effects of shape on perceived lightness by comparing the flat and bumpy diffuse plates. However, the experimental results confirmed that the matched reflectances for the flat and bumpy diffuse plates and for the flat plastic plates in Experiment 1 were almost identical, indicating that all three types of standard stimuli were equally suitable as standard stimuli for HEI calculations. Therefore, we used the flat plastic plates as standard stimuli for HEI calculations to facilitate comparison with Experiment 1. A comparison of the three types of standard stimuli is provided in Supplementary Material S4.
Procedure
The procedure for measuring perceived lightness was the same as in Experiment 1, except for the detailed instructions and session structure. As stated elsewhere in this article, the stimuli sometimes appeared nonuniform in lightness. Therefore, the observers were instructed to judge the overall impression of surface lightness, even if the surface seemed spatially variegated. This criterion is similar to ensemble perception; for example, ensemble color perception is approximately based on the mean color of the stimulus (e.g., Maule, Witzel, & Franklin, 2014). Therefore, HEIs defined on the basis of mean luminance should also remain valid, to some extent, for observer responses to stimuli with nonuniform lightness.
In the experiment, the 128 test stimuli (80 one-subset-randomized, 40 one-subset-target, 4 flat diffuse, and 4 bumpy diffuse images) were each presented twice in random order, resulting in 256 trials. Each session comprised 32 trials, preceded by practice trials in which random stimuli were presented for the first minute. Thus, there were eight sessions: all stimulus images were evaluated once in the first four sessions and again in the second four. As in Experiment 1, the observing eye was switched between sessions; half of the observers used the left eye and the other half the right eye in the first session. All observers performed four sessions per day, taking breaks between sessions as needed.
Results and discussion
The matched reflectances of the 80 one-subset-randomized stimuli are shown in Figure 16. For the high-HEI synthesized images, the matched reflectance tended to be higher in the marginal randomization condition than in the other conditions. In contrast, for the low-HEI synthesized images, the matched reflectance was comparable across randomization conditions. A three-way analysis of variance with the factors stimulus pair, randomized statistics, and target HEI (low or high) showed a significant interaction between randomized statistics and target HEI, F(3, 959) = 9.65, η² = 0.011, p < 0.001. These results raise the possibility that the impact of statistics randomization on highlight detection differs across statistic subsets for the high-HEI synthesized images.
Figure 16.
 
Matched reflectance of one-subset-randomized stimuli.
The matched reflectances of the 20 one-subset-target stimuli are shown in Figure 17. Overall, the matched reflectance of the high-HEI synthesized images tended to be lower than that of the low-HEI synthesized images. A three-way analysis of variance with the factors stimulus pair, target HEI, and base image confirmed a significant main effect of target HEI, F(1, 479) = 42.22, η² = 0.04, p < 0.001. This trend is consistent with the notion that highlights were excluded in lightness perception of the high-HEI synthesized images, leading to decreased perceived lightness.
Figure 17.
 
Matched reflectance of one-subset-target stimuli.
The HEIs were calculated for each stimulus image in the same manner as in Experiment 1. The HEIs for the one-subset-randomized images are shown in Figure 18. Figure 18a shows the results for the target images measured in Experiment 1; the HEIs were naturally higher for the high-HEI target images than for the low-HEI images. Figure 18b shows the HEIs for the synthesized images in which one of the four subsets was randomized. The HEIs were higher for the high-HEI synthesized images than for the low-HEI ones in all conditions except the marginal randomized condition, where this tendency seemed to disappear. We tested the significance of the differences in HEIs between the high- and low-HEI synthesized images in Figure 18c using a nonparametric bootstrap test with 10,000 repetitions in each randomization condition, with significance levels adjusted by the Bonferroni correction. The results showed no significant difference in the marginal randomized condition, whereas the differences were significant in the other three randomized conditions (p < 0.001 for all three). In other words, when one of the latter three subsets was randomized in the object images, highlights were still excluded, probably on the basis of the three maintained subsets. These results suggest that the marginal subset acts as a necessary condition for highlights to be excluded in lightness perception, whereas the other three PS statistic subsets are not essential for highlight exclusion. That is, the minimum requirement for highlight exclusion is that the marginal subset meets certain states (e.g., high contrast).
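The bootstrap comparison can be sketched as follows. This is a minimal illustration, not the authors' analysis code, assuming per-stimulus HEI values for the high- and low-HEI synthesized images in one randomization condition; the toy data here are hypothetical.

```python
import numpy as np

def bootstrap_diff_test(hei_high, hei_low, n_boot=10_000, seed=0):
    """Nonparametric bootstrap test for the difference in mean HEI
    between high- and low-HEI synthesized images."""
    rng = np.random.default_rng(seed)
    high = np.asarray(hei_high, dtype=float)
    low = np.asarray(hei_low, dtype=float)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement and take the mean difference
        d_high = rng.choice(high, size=high.size, replace=True).mean()
        d_low = rng.choice(low, size=low.size, replace=True).mean()
        diffs[i] = d_high - d_low
    # Two-sided p value: proportion of bootstrap differences crossing zero
    p = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    return diffs.mean(), p

# Hypothetical HEI values for one randomization condition
rng = np.random.default_rng(42)
high = rng.normal(0.08, 0.02, size=10)
low = rng.normal(0.01, 0.02, size=10)
diff, p = bootstrap_diff_test(high, low)
alpha = 0.05 / 4  # Bonferroni correction over four randomization conditions
print(diff, p, p < alpha)
```

A significant result under the Bonferroni-adjusted level corresponds to a preserved high-versus-low HEI difference, as found for the three non-marginal randomization conditions.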
Figure 18.
 
(a) Highlight exclusion index for target images measured in Experiment 1. The bar color indicates the high- and low-HEI target images. The circles and bars show the results of the ten individual stimuli and the mean across them, respectively. The error bars are 95% confidence intervals obtained from bootstrapping with 10,000 iterations. (b) Highlight exclusion index for one-subset-randomized images. The vertical axis shows the randomized PS statistic subset, and the bar colors indicate the high- and low-HEI synthesized images. (c) Difference in highlight exclusion index between high- and low-HEI synthesized images in (b).
HEIs were also calculated for the one-subset-target images; the results are shown in Figure 19. The vertical axis shows the initial-image conditions: the high- and low-HEI initial images and 'All,' which merges the results across them. In all conditions, the HEIs were higher for the high-HEI images than for the low-HEI images. We tested the statistical differences in the indices between the high- and low-HEI images using a bootstrap procedure with 10,000 repetitions combined with the Bonferroni correction. The difference was significant in all conditions (p < 0.001 for all panels), indicating that the visual system can exclude highlights for lightness perception even when the PS statistic subsets other than the marginal subset are randomized.
Figure 19.
 
Highlight exclusion index for one-subset-target images with subsets other than marginal subsets randomized. ‘All’ on the vertical axis shows the merged results of high- and low-HEI synthesized images.
A concern in our results is that perceived object shapes, rather than highlight exclusion, modulated the HEIs. Perceived shape alters lightness perception (Knill & Kersten, 1991) and glossiness perception (Marlow, Todorović, & Anderson, 2015). Knill and Kersten (1991) demonstrated that the same luminance pattern induces different degrees of perceived lightness when shape perception is changed using the surrounding contour. Similarly, Marlow et al. (2015) showed that changing the shape perception of an identical luminance pattern based on its contour affects whether it appears metallic or matte. In Experiment 2, the shape perception of the test stimuli may have been more ambiguous than in Experiment 1 because the stimuli were synthesized images, and this ambiguous shape perception might have altered lightness perception. However, the changes in perceived shape in those two previous studies were much larger than those in the current study. Thus, the effect of shape perception in Experiment 2, if any, was likely subtle.
General discussion
This study aimed to elucidate the PS statistic subsets that provide cues for highlight exclusion in lightness perception. In Experiment 1, we measured the perceived lightness of rendered object images and calculated the HEI, which was defined to represent how much perceived lightness decreased from the prediction based on the stimulus mean luminance. We then calculated the correlation coefficients between the HEI and each PS statistic subset. The results indicated that all PS statistic subsets correlated well with the HEIs, suggesting that all subsets are candidate cues for highlight exclusion. In Experiment 2, we created synthesized images in which the PS statistic subsets were manipulated. The results showed that randomizing the marginal subset in image synthesis abolished the differences in HEIs, whereas randomizing the other subsets maintained the differences in HEIs between the images synthesized toward the high- and low-HEI targets. These results suggest that the marginal subset is a necessary condition for highlight exclusion, whereas the other quasi-low-order subsets (linear cross-position, energy stats, and linear cross-scale) may not provide direct cues for highlight exclusion.
This study has two main findings: 1) it reconfirms that simple image features can affect lightness judgments, and 2) it provides new insights into the relationship between PS statistics and highlight exclusion. Several previous studies have reported that important image cues for glossiness perception are simple image features on object surfaces, such as luminance contrast (Wiebel et al., 2015) and the skewness of luminance histograms (Motoyoshi et al., 2007). Our suggestion that the marginal subset acts as a necessary condition for highlight exclusion is consistent with these studies. Furthermore, unlike previous studies, we quantitatively evaluated the effects of manipulating more complex (quasi-low-order) image features beyond the marginal ones. Thus, this study not only reconfirmed the importance of simple image features in highlight exclusion but also suggested that the quasi-low-order PS statistic subsets do not act as necessary conditions for highlight exclusion.
It should be emphasized that the current results do not negate the role of higher order image features in highlight exclusion; rather, there must be features of higher order than PS statistics that are crucial for it. Our results suggest that the three quasi-low-order PS statistic subsets other than the marginal subset are not essential for highlight exclusion. However, this does not contradict previous reports on the importance of complex image features in highlight exclusion (Anderson & Kim, 2009; Kim et al., 2011; Marlow et al., 2011; Sawayama & Nishida, 2018), because even simple object properties, such as object contours, are difficult to capture with PS statistics (Portilla & Simoncelli, 2000). Recently, convolutional neural network models that emulate human highlight detection have been proposed and shown to represent complex image features, such as photo-geometrical constraints (Prokott & Fleming, 2022). Additionally, physiological studies of gloss-selective cells in the monkey inferior temporal cortex (Nishio, Goda, & Komatsu, 2012; Nishio, Shimokawa, Goda, & Komatsu, 2014) support the importance of image features more complex than the PS statistics corresponding to V1 and V2. In our experiment, if some information in a PS statistic subset was lost through the randomization process, the visual system could still rely on the other PS statistic subsets to compensate for the missing information. In other words, the three PS statistic subsets other than the marginal subset are highly likely to be part of a sufficient condition for highlight exclusion. Therefore, the importance of higher order image features, such as highlight-shading congruence and luminance gradient magnitude, is not ruled out by our results.
What distinguishes the marginal subset from the other quasi-low-order statistics? The marginal subset comprises global and simple image features, such as pixel-based contrast and skewness. Randomizing them could therefore have affected the higher order features essential for highlight detection (e.g., pixel-based rms contrast affects the perceptual magnitude of luminance texture patterns). We thus consider that the marginal subset acted as a necessary condition for highlight exclusion in our experiment, even if higher order features are what is essentially crucial. By contrast, the other three quasi-low-order subsets (linear cross-position, energy stats, and linear cross-scale) represent local image features, such as orientation and scale cross-correlations. The fact that none of these three subsets was necessary for highlight exclusion in our experiments raises the possibility that information integrated across all three subsets in higher order visual processing represents specular highlight-ness. In other words, each subset does not directly represent specular highlights but contains only a small part of the highlight information, if any.
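As an illustration of what the marginal subset captures, it essentially reduces to pixel-based moment statistics of the luminance image. The sketch below is not the Portilla-Simoncelli implementation (which also computes skewness and kurtosis of lowpass bands at multiple scales); it shows only the single-scale pixel statistics, with a synthetic image standing in for a stimulus.

```python
import numpy as np

def marginal_stats(image):
    """Pixel-based marginal statistics of a luminance image:
    mean, variance, skewness, kurtosis, and range."""
    x = np.asarray(image, dtype=float).ravel()
    mu = x.mean()
    var = x.var()
    sd = np.sqrt(var)
    skew = ((x - mu) ** 3).mean() / sd ** 3
    kurt = ((x - mu) ** 4).mean() / var ** 2
    return {"mean": mu, "var": var, "skew": skew,
            "kurt": kurt, "min": x.min(), "max": x.max()}

# A positively skewed synthetic image: sparse bright pixels, loosely
# analogous to a surface with highlight-like luminance outliers.
rng = np.random.default_rng(0)
img = rng.exponential(scale=1.0, size=(64, 64))
stats = marginal_stats(img)
print(stats["skew"])  # positive for sparse bright pixels
```

Randomizing these few global numbers perturbs contrast and the luminance histogram shape, which is why their manipulation can plausibly disrupt even higher order highlight cues built on top of them.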
Additionally, the PS statistics do not capture all the image features represented in V1 and V2. One example is border ownership: a texture containing an open circle and one containing a closed circle can have identical PS statistics (Portilla & Simoncelli, 2000), indicating that PS statistics do not represent border ownership. In contrast, V1 and V2 neurons encode border ownership information (Zhou, Friedman, & von der Heydt, 2000; Layton, Mingolla, & Yazdanbakhsh, 2012). Because specular highlights typically exhibit closed luminance edges, the visual system may use border ownership information to identify them. In our results, the decrease in HEI for high-roughness stimuli may be related to the difficulty of assigning border ownership to blurred luminance edges. However, our experiments, which depend on PS statistics, could not test this hypothesis.
The generalizability of our results, particularly regarding the effects of environment maps, should also be discussed. We chose the environment maps for stimulus rendering to cover different aspects of the environment-map features that affect material impressions (e.g., Zhang et al., 2019). Nevertheless, the overall trend of the matched reflectance was similar across all environment maps, as shown in Figure 7. Additionally, the results of Experiment 2, in which the image synthesis targets were computer-graphics images rendered under different environment maps, were similar across stimuli: we found no significant main effect or interactions of the stimulus pair factor, which reflects the differences in environment maps. Thus, we speculate that similar conclusions would hold for various environment maps. However, it may be necessary to confirm the generalizability of our results more broadly using extreme environment maps in which, for instance, illuminants are highly localized.
Finally, more sophisticated approaches that manipulate a myriad of image features are essential to elucidate the mechanisms of highlight exclusion. This study examined these mechanisms by focusing on the low- and quasi-low-order image features represented in PS statistics. However, because the image features in PS statistics correspond to relatively low-order processing in the visual system, relying on PS statistics alone limits our understanding of the mechanisms of material perception. Deep learning models, for instance, can be used to investigate image features of higher order than those examined here (e.g., Storrs et al., 2021). In addition, binocular disparity (Marlow et al., 2012), eye movements (Toscani, Yücel, & Doerschner, 2019), and object motion (Doerschner et al., 2011) all modulate material perception. Large models incorporating all these factors may be necessary to account for the properties of human material perception.
Conclusions
In this study, we investigated the relationship between the subsets of PS statistics and highlight exclusion in lightness perception on glossy plates. In the first experiment, we found that all types of PS statistics correlated with the magnitude of highlight exclusion for lightness perception. In contrast, in the second experiment, using synthesized images in which the PS statistic subsets were manipulated, only the manipulation of the lowest level features (the marginal subset) strongly affected highlight exclusion performance. These results suggest that low-order features are crucial for highlight exclusion, whereas the other quasi-low-order features, such as energy stats, are not necessary. Image features of higher order than those in PS statistics are likely to be involved in highlight exclusion as well.
Acknowledgments
Supported by JSPS KAKENHI Grant Numbers JP19H04197 and 21H05810 to TN. 
Commercial relationships: none. 
Corresponding author: Takehiro Nagai. 
Email: nagai.t.aa@m.titech.ac.jp. 
Address: Department of Information and Communications Engineering, School of Engineering, Tokyo Institute of Technology, 4259-G2-1 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8502, Japan. 
References
Adelson E. H. (1993). Perceptual organization and the judgment of brightness. Science, 262(5142), 2042–2044. [PubMed]
Adelson, E. H. (2000). Lightness perception and lightness illusions. In Gazzaniga, M. (Ed.), The New Cognitive Neurosciences, 2nd ed. Cambridge, MA: MIT Press.
Anderson, B. L., & Kim, J. (2009). Image statistics do not explain the perception of gloss and lightness. Journal of Vision, 9(11), 10. [PubMed]
Anderson, B., & Winawer, J. (2005). Image segmentation and lightness perception. Nature, 434(7029), 79–83. [PubMed]
Arend, L. E., & Spehar, B. (1993). Lightness, brightness, and brightness contrast: 1. Illuminance variation. Perception & Psychophysics, 54(4), 446–456. [PubMed]
Barten, P. G. (2003). Formula for the contrast sensitivity of the human eye. Proceedings of SPIE 5294, Image Quality and System Performance, 231–238.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. [PubMed]
Cottaris, N. P., Jiang, H., Ding, X., Wandell, B. A., & Brainard, D. H. (2019). A computational-observer model of spatial contrast sensitivity: Effects of wave-front-based optics, cone-mosaic structure, and inference engine. Journal of Vision, 19(4), 8. [PubMed]
Doerschner, K., Fleming, R., Yilmaz, O., Schrater, P., Hartung, B., & Kersten, D. (2011). Visual motion and the perception of surface material. Current Biology, 21(23), 2010–2016.
Fleming, R. W. (2014). Visual perception of materials and their properties. Vision Research, 94, 62–75. [PubMed]
Freeman, J., & Simoncelli, E. P. (2011). Metamers of the ventral stream. Nature Neuroscience, 14(9), 1195–1201. [PubMed]
Freeman, J., Ziemba, C. M., Heeger, D. J., Simoncelli, E. P., & Movshon, J. A. (2013). A functional and perceptual signature of the second visual area in primates. Nature Neuroscience, 16(7), 974–981. [PubMed]
Gulrajani, I., Kumar, K., Ahmed, F., Taiga, A. A., Visin, F., Vazquez, D., et al. (2016). PixelVAE: A latent variable model for natural images. arXiv, 1611.05013.
Honson, V., Huynh-Thu, Q., Arnison, M., Monaghan, D., Isherwood, Z. J., & Kim, J. (2020). Effects of shape, roughness and gloss on the perceived reflectance of colored surfaces. Frontiers in Psychology, 11, 485. [PubMed]
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat visual cortex. Journal of Physiology, 160(1), 106–154.
Jakob, W. (2010). Mitsuba renderer. Available: https://www.mitsuba-renderer.org.
Kelly, D. H. (1983). Spatiotemporal variation of chromatic and achromatic contrast thresholds. Journal of the Optical Society of America, 73(6), 742–750. [PubMed]
Kim, J., Marlow, P., & Anderson, B. L. (2011). The perception of gloss depends on highlight congruence with surface shading. Journal of Vision, 11(9), 4. [PubMed]
Kim, J., Tan, K., & Chowdhury, N. S. (2016). Image statistics and the fine lines of material perception. i-Perception, 7, 204166951665804.
Kim, M., Gold, J., & Murray, R. (2018). What image features guide lightness perception? Journal of Vision, 18(13), 1. [PubMed]
Kleiner, M., Brainard, D. H., & Pelli, D. G. (2007). What's new in Psychtoolbox-3? Perception, 36 ECVP Abstract Supplement.
Knill, D. C., & Kersten, D. (1991). Apparent surface curvature affects lightness perception. Nature, 351(6323), 228–230. [PubMed]
Layton, O., Mingolla, E., & Yazdanbakhsh, A. (2012). Dynamic coding of border-ownership in visual cortex. Journal of Vision, 12(13), 8. [PubMed]
Marlow, P. J., Kim, J., & Anderson, B. L. (2012). The perception and misperception of specular surface reflectance. Current Biology, 22(20), 1909–1913.
Marlow, P., Kim, J., & Anderson, B. (2011). The role of brightness and orientation congruence in the perception of surface gloss. Journal of Vision, 11(9), 16. [PubMed]
Marlow, P. J., Todorović, D., & Anderson, B. L. (2015). Coupled computations of three-dimensional shape and material. Current Biology, 25(6), R221–R222.
Maule, J., Witzel, C., & Franklin, A. (2014). Getting the gist of multiple hues: metric and categorical effects on ensemble perception of hue. Journal of the Optical Society of America A, 31(4), A93–A102.
Motoyoshi, I., Nishida, S., Sharan, L., & Adelson, E. H. (2007). Image statistics and the perception of surface qualities. Nature, 447(7141), 206–209. [PubMed]
Mullen, K. T. (1985). The contrast sensitivity of human colour vision to red-green and blue-yellow chromatic gratings. Journal of Physiology, 359, 381–400.
Murray, R. F. (2020). A model of lightness perception guided by probabilistic assumptions about lighting and reflectance. Journal of Vision, 20(7), 28. [PubMed]
Nishida, S. (2019). Image statistics for material perception. Current Opinion in Behavioral Sciences, 30, 94–99.
Nishio, A., Goda, N., & Komatsu, H. (2012). Neural selectivity and representation of gloss in the monkey inferior temporal cortex. Journal of Neuroscience, 32(31), 10780–10793.
Nishio, A., Shimokawa, T., Goda, N., & Komatsu, H. (2014). Perceptual gloss parameters are encoded by population responses in the monkey inferior temporal cortex. Journal of Neuroscience, 34(33), 11143–11151.
Okazawa, G., Tajima, S., & Komatsu, H. (2014). Image statistics underlying natural texture selectivity of neurons in macaque V4. Proceedings of the National Academy of Sciences of the United States of America, 112(4), E351–E360. [PubMed]
Phuangsuwan, C., Ikeda, M., & Katemake, P. (2013). Color constancy demonstrated in a photographic picture by means of a D-up viewer. Optical Review, 20(1), 74–81.
Portilla, J., & Simoncelli, E. P. (2000). A parametric texture model based on joint statistics of complex wavelet coefficients. International Journal of Computer Vision, 40(1), 49–70.
Prokott, E., & Fleming, R. W. (2022). Identifying specular highlights: Insights from deep learning. Journal of Vision, 22(7), 6. [PubMed]
Prokott, K., Tamura, H., & Fleming, R. (2021). Gloss perception: Searching for a deep neural network that behaves like humans. Journal of Vision, 21(12), 14. [PubMed]
Sawayama, M., & Nishida, S. (2018). Material and shape perception based on two types of intensity gradient information. PLOS Computational Biology, 14, e1006061. [PubMed]
Schmid, A., & Anderson, B. (2014). Do surface reflectance properties and 3-D mesostructure influence the perception of lightness? Journal of Vision, 14(8), 24. [PubMed]
Sharan, L., Li, Y., Motoyoshi, I., Nishida, S., & Adelson, E. H. (2008). Image statistics for surface reflectance perception. Journal of the Optical Society of America A, 25(4), 846–865.
Storrs, K. R., Anderson, B. L., & Fleming, R. W. (2021). Unsupervised learning predicts human perception and misperception of gloss. Nature Human Behaviour, 5(10), 1402–1417. [PubMed]
Todd, J. T., Norman, J. F., & Mingolla, E. (2004). Lightness constancy in the presence of specular highlights. Psychological Science, 15(1), 33–39. [PubMed]
Toscani, M., Valsecchi, M., & Gegenfurtner, K. R. (2013). Optimal sampling of visual information for lightness judgments. Proceedings of the National Academy of Sciences of the United States of America, 110(27), 11163–11168. [PubMed]
Toscani, M., Valsecchi, M., & Gegenfurtner, K. R. (2017). Lightness perception for matte and glossy complex shapes. Vision Research, 131, 82–95. [PubMed]
Toscani, M., Yücel, E., & Doerschner, K. (2019). Gloss and speed judgments yield different fine tuning of saccadic sampling in dynamic scenes. i-Perception, 10, 204166951988907.
Wiebel, C. B., Toscani, M., & Gegenfurtner, K. R. (2015). Statistical correlates of perceived gloss in natural images. Vision Research, 115, 175–187. [PubMed]
Xiao, B., & Brainard, D. (2008). Surface gloss and color perception of 3D objects. Visual Neuroscience, 25, 371–385. [PubMed]
Zhang, F., de Ridder, H., Barla, P., & Pont, S. (2019). A systematic approach to testing and predicting light-material interactions. Journal of Vision, 19(4), 11. [PubMed]
Zhao, S., Song, J., & Ermon, S. (2017). Towards deeper understanding of variational autoencoding models. arXiv, 1702.08658.
Zhou, H., Friedman, H., & von der Heydt, R. (2000). Coding of border ownership in monkey visual cortex. Journal of Neuroscience, 20, 6594–6611.
Ziemba, C. M., Freeman, J., Simoncelli, E. P., & Movshon, J. A. (2018). Contextual modulation of sensitivity to naturalistic image structure in macaque V2. Journal of Neurophysiology, 120(2), 409–420. [PubMed]
Figure 1.
 
Flowchart of stimulus creation in Experiment 1.
Figure 2.
 
An example stimulus in a trial. The text and arrows were not presented during the experiment.
Figure 3.
 
Environment maps. The string below each image indicates its file name.
Figure 4.
 
(a, b) Two stimulus images in Experiment 1. (c, d) Synthesized images created from a white noise image based on PS statistics of (a) and (b), respectively.
Figure 5.
 
Matched diffuse reflectance adjusted by the observers for the 'gamrig' environment map. The panels correspond with the results of plastic images with different roughness and diffuse images. The error bars show the standard errors of the mean (SEMs). The vertical axis is shown on a logarithmic scale. The SEMs were also calculated on the logarithmic scale of reflectance.
Figure 6.
 
(a) Matched reflectance and (b) mean luminance of bumpy plastic stimuli as a function of roughness. The values were averaged across depth coefficients and frequencies. The error bars show standard errors of mean across the frequencies and depth coefficients.
Figure 7.
 
Relationship between mean luminance of plastic stimuli and matched reflectance for every environment map. Red plots indicate the standard stimuli (flat plastic plates), and blue plots indicate other test stimuli. Error bars show standard errors of the mean. The black line is a linear regression line for the four black plots. Both the vertical and horizontal axes are shown on a logarithmic scale.
Figure 8.
 
Relationship between roughness and highlight exclusion index. The error bars indicate standard errors of the mean.
Figure 9.
 
Examples of (a) low- and (b) high-HEI stimuli (HEIs of 0.00102 and 0.0946, respectively).
Figure 10.
 
Results of PLS regression. (a) Determination coefficient as a function of the number of PLS components n. (b) Maximum determination coefficient across values of n.
Figure 11.
 
Stimulus creation procedure in Experiment 2. In this example, the target image was selected from the high-HEI stimuli in Experiment 1, and the marginal statistic subset was randomized. This example also shows how one-subset-randomized stimuli were created.
Figure 12.
 
Tone mapping operator represented by Equations (2) and (3). Luminance on the horizontal and vertical axes is shown on a logarithmic scale.
Figure 13.
 
Examples of one-subset-randomized images. The rows show the high- and low-HEI synthesized images, and the columns show the randomized PS statistic subsets. Note that the stimulus size was much larger in the experiment (7.8 × 7.8 degrees), leading to surface quality impressions somewhat different from the images shown here.
Figure 14.
 
Initial images used to synthesize one-subset-target images. The left and right images were randomly selected from the lowest and highest 25% of HEIs in Experiment 1, respectively.
Figure 15.
 
Examples of one-subset-target synthesized images. The rows correspond to the high- and low-HEI synthesized images. The columns show the initial computer-graphics images with the high and low HEIs.
Figure 16.
 
Matched reflectance of one-subset-randomized stimuli.
Figure 17.
 
Matched reflectance of one-subset-target stimuli.
Figure 18.
 
(a) Highlight exclusion index for the target images measured in Experiment 1. The bar color indicates the high- and low-HEI target images. The circles and bars show the results of the ten individual stimuli and the mean across them, respectively. The error bars are 95% confidence intervals obtained from bootstrapping with 10,000 iterations. (b) Highlight exclusion index for one-subset-randomized images. The vertical axis shows the randomized PS statistic subset, and the bar colors indicate the high- and low-HEI synthesized images. (c) Difference in the highlight exclusion index between the high- and low-HEI synthesized images in (b).
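The bootstrapped 95% confidence intervals of the mean described in this caption can be computed as in the minimal sketch below; the HEI values are hypothetical placeholders, not the measured data:

```python
import numpy as np

def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """95% CI of the mean via nonparametric bootstrap resampling."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    means = np.empty(n_boot)
    for i in range(n_boot):
        # Resample with replacement and record the resampled mean.
        means[i] = rng.choice(values, size=values.size, replace=True).mean()
    # Percentile interval at alpha/2 and 1 - alpha/2.
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Hypothetical HEIs for ten stimuli (illustrative values only).
heis = [0.01, 0.03, 0.02, 0.05, 0.04, 0.06, 0.02, 0.03, 0.05, 0.04]
lo, hi = bootstrap_ci(heis)
```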
Figure 19.
 
Highlight exclusion index for one-subset-target images with all subsets other than the marginal subset randomized. 'All' on the vertical axis shows the merged results of the high- and low-HEI synthesized images.
Table 1.
 
Parameters for luminance manipulations on rendered images.
Table 2.
 
Results of the three-way analysis of variance for each environment map. Only conditions with statistical significance are shown.
Table 3.
 
Linear regression formula and determination coefficient for each environment map. x represents the mean of the common logarithm of luminance (cd/m²), and the objective variable is the common logarithm of the matched reflectance.
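The per-environment-map regression in Table 3 amounts to an ordinary least-squares fit on common-logarithm axes. A minimal sketch with made-up (luminance, matched reflectance) pairs, not the paper's data:

```python
import numpy as np

# Hypothetical measurements for one environment map (illustrative only).
luminance = np.array([20.0, 40.0, 80.0, 160.0])   # mean luminance, cd/m^2
reflectance = np.array([0.10, 0.18, 0.33, 0.60])  # matched reflectance

# Fit log10(reflectance) = a * log10(luminance) + b, as in Table 3.
x = np.log10(luminance)
y = np.log10(reflectance)
a, b = np.polyfit(x, y, 1)

# Determination coefficient R^2 of the fit.
y_hat = a * x + b
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```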
Table 4.
 
Stimulus parameters of target images.