The visual system's representation of objects includes percepts that correlate with object surface reflectance. In general, these include color as well as perceptual correlates of object material properties, such as glossiness (Maloney & Brainard, 2010). The retinal image, however, does not provide an explicit representation of object reflectance. Rather, image intensities depend both on object reflectance and on the properties of the illumination. To produce stable perceptual representations of object surface reflectance, the visual system must process the retinal image to minimize the effects of variation in the illumination.
Figure 1 shows an achromatic image in which the illumination varies markedly across space.
A number of theorists have postulated that the stabilization of object appearance occurs in two stages (Adelson, 2000; Gilchrist, 1977; Gilchrist et al., 1999; Kardos, 1934; Koffka, 1935). The first stage segments the image into regions that each have roughly constant illumination. The second stage then, in effect, estimates the illuminant within each region and uses that estimate in its conversion between luminance and lightness for that region.
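As an illustration only, the two-stage account can be sketched in code. This is not a model from the literature cited above: the segmentation is assumed given, and the choice of estimating each region's illuminant from its maximum luminance (an anchoring-style heuristic) is an assumption made here for concreteness.

```python
# Toy sketch of the two-stage account (illustrative, not the paper's model):
# Stage 1 segments checks into regions of roughly constant illumination;
# Stage 2 estimates the illuminant within each region and converts
# luminance to lightness by normalizing against that estimate.

def lightness_two_stage(luminances, region_labels):
    """Map each luminance to a lightness via a per-region illuminant estimate."""
    # Stage 1 (assumed given here): region_labels assigns each check to a
    # region of roughly constant illumination.
    regions = {}
    for lum, label in zip(luminances, region_labels):
        regions.setdefault(label, []).append(lum)
    # Stage 2: estimate each region's illuminant from its maximum luminance
    # (anchoring-style assumption) and normalize luminances by it.
    illum_est = {label: max(lums) for label, lums in regions.items()}
    return [lum / illum_est[label] for lum, label in zip(luminances, region_labels)]

# The same three surfaces under a dim and a 10x brighter illuminant map to
# the same lightness values once normalized within their regions.
lums = [20.0, 50.0, 100.0, 200.0, 500.0, 1000.0]
labels = ["dim", "dim", "dim", "bright", "bright", "bright"]
print(lightness_two_stage(lums, labels))  # [0.2, 0.5, 1.0, 0.2, 0.5, 1.0]
```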
What information could the visual system use to segment the image according to illumination? Photometric cues provide one source of information that can indicate illumination changes. Surface albedo is typically thought to vary over about a 30-to-1 range in natural scenes (see, for example, the reflectance data summarized in Wyszecki & Stiles, 1982). Thus, if two grayscale image regions differ in luminance by a factor much larger than 30, they are unlikely to share a common illuminant. In the image shown in Figure 1, it is easy to imagine that such a difference in image intensity helps mediate the impression that the floor is lit by two distinct illuminants.
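The photometric heuristic just described can be made concrete in a short sketch. The 30:1 bound, the function name, and the example luminances are all illustrative assumptions, not quantities taken from the paper's methods.

```python
# Illustrative sketch of the photometric cue: surface albedo spans at most
# roughly a 30:1 range in natural scenes, so a luminance ratio well beyond 30
# between two regions cannot be explained by albedo alone and suggests an
# illumination change. The 30:1 figure is the approximate bound cited above.

ALBEDO_RANGE = 30.0  # approximate max albedo ratio in natural scenes

def likely_different_illuminants(lum_a, lum_b, albedo_range=ALBEDO_RANGE):
    """Return True if the luminance ratio exceeds what albedo alone allows."""
    hi, lo = max(lum_a, lum_b), min(lum_a, lum_b)
    return hi / lo > albedo_range

# A patch at 900 cd/m^2 vs. one at 10 cd/m^2 (ratio 90) cannot share a
# common illuminant under the heuristic; a ratio of 6 easily can.
print(likely_different_illuminants(900.0, 10.0))  # True
print(likely_different_illuminants(60.0, 10.0))   # False
```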
On the other hand, a number of geometric factors may also correlate with illumination changes. One, for example, is distance: the further apart two surface patches are in a scene, the less likely it seems that they will share a common illuminant. Accordingly, experiments have found a decreasing influence of contextual surfaces on target surface appearance with increasing distance (Kurki, Peromaa, Hyvärinen, & Saarinen, 2009; Reid & Shapley, 1988; Shimozaki, Eckstein, & Abbey, 2005; Spehar, Debonet, & Zaidi, 1996). Closely related is the idea that coplanar surfaces are more likely to share a common illuminant than surfaces oriented differently within a scene (Gilchrist, 1980). Various cues are available to indicate surface orientation in a scene (e.g., binocular disparity), as well as changes in the orientation of groups of surfaces (e.g., Ψ-junctions; Sinha & Adelson, 1993). Finally, the luminance relations across certain geometric configurations may signal illumination boundaries (e.g., X- and T-junctions; Todorović, 1997).
Despite the centrality of segmentation in theories of lightness, little is known about how well observers can use the type of photometric information induced by changes of illumination to segregate scenes. For achromatic images, changing the illumination changes the statistical distribution of the luminances reaching the observer, because the luminance distribution arises as the product of the illuminant intensity and the underlying distribution of surface albedos. In the present paper, then, we step back from the specifics of illuminant-based segmentation and ask the more basic question of how well observers can detect within-image changes in the distribution of image luminances. That is, we sought to study fundamental aspects of this ability, using simple stimuli that did not evoke percepts of illuminated surfaces. We used checkerboard stimuli and asked observers to judge which of two images contained a region where the luminance statistics differed from those in the rest of the scene. We also compared the data to predictions from an ideal observer model. Finally, we asked whether manipulating the geometric structure of the images affected performance on our segregation task. The measurements provide baseline information that can be exploited in future experiments that study illumination segmentation in more complex scenes to determine the role of additional information sources.
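The multiplicative image model underlying the task can be sketched as follows. The uniform albedo distribution, the illuminant intensities, and the region sizes are illustrative assumptions chosen for the sketch, not the paper's stimulus parameters.

```python
# Illustrative sketch, assuming a simple multiplicative image model:
# each check's luminance is illuminant intensity times surface albedo.
# Changing the illuminant over a region rescales that region's luminance
# distribution; this change in luminance statistics is what observers
# must detect in the segregation task described above.
import random
import statistics

random.seed(0)

def checker_luminances(n, illuminant, albedo_lo=0.03, albedo_hi=0.9):
    """Sample n check luminances as illuminant x albedo (albedo ~ uniform)."""
    return [illuminant * random.uniform(albedo_lo, albedo_hi) for _ in range(n)]

background = checker_luminances(200, illuminant=100.0)
target = checker_luminances(25, illuminant=300.0)  # region under a 3x illuminant

# The target region's mean luminance is scaled, roughly, by the ratio of the
# illuminants, while the relative spread (std/mean) of the underlying
# distribution is unchanged, since multiplication rescales it as a whole.
print(statistics.mean(target) / statistics.mean(background))
```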