Although studies of vision and graphics often assume simple illumination models, real-world illumination is highly complex, with reflected light incident on a surface from almost every direction. One can capture the illumination from every direction at one point photographically using a spherical illumination map. This work illustrates, through analysis of photographically acquired, high dynamic range illumination maps, that real-world illumination possesses a high degree of statistical regularity. The marginal and joint wavelet coefficient distributions and harmonic spectra of illumination maps resemble those documented in the natural image statistics literature. However, illumination maps differ from typical photographs in that illumination maps are statistically nonstationary and may contain localized light sources that dominate their power spectra. Our work provides a foundation for statistical models of real-world illumination, thereby facilitating the understanding of human material perception, the design of robust computer vision systems, and the rendering of realistic computer graphics imagery.

*k* = 2/*π*, where *k* is the ratio of the radius of the cylinder to the radius of the sphere. In particular, an infinitesimal patch on the sphere at latitude *ϑ* will find itself expanded by a factor of *k*/cos *ϑ* in the horizontal direction and reduced by a factor of cos *ϑ* in the vertical direction. Because the product of these two factors is a constant *k*, this projection preserves areas, even though it heavily distorts angles near the poles.
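The area-preservation argument can be checked numerically. The sketch below assumes a unit sphere and a mapping of the form x = *k*·longitude, y = sin(latitude), which is one mapping consistent with the stretch factors stated above; the function names `project` and `local_scales` are hypothetical and not taken from the original work.

```python
import math

def project(lon, lat, k=2 / math.pi):
    """Map (longitude, latitude) in radians on a unit sphere to the cylinder.

    Assumed mapping: x = k * lon, y = sin(lat).
    """
    return k * lon, math.sin(lat)

def local_scales(lat, k=2 / math.pi, h=1e-6):
    """Finite-difference horizontal and vertical stretch factors at a latitude."""
    x0, y0 = project(0.0, lat, k)
    # A longitude step h spans an arc of length cos(lat) * h on the sphere.
    x1, _ = project(h, lat, k)
    horizontal = (x1 - x0) / (math.cos(lat) * h)
    # A latitude step h spans an arc of length h on the sphere.
    _, y1 = project(0.0, lat + h, k)
    vertical = (y1 - y0) / h
    return horizontal, vertical

for deg in (0, 30, 60, 85):
    hs, vs = local_scales(math.radians(deg))
    # The product of the two factors should stay close to k at every latitude.
    print(deg, round(hs, 4), round(vs, 4), round(hs * vs, 4))
```

Under these assumptions the horizontal factor grows as the latitude approaches the poles while the vertical factor shrinks, and their product stays near 2/*π* ≈ 0.6366.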

*SD* *σ* = 1.04, kurtosis^{1} *k* = 4.04, and differential entropy^{2} *H* = 2.06 bits. The Debevec image set has *σ* = 1.32, *k* = 12.49, and *H* = 2.21 bits. Huang and Mumford found *σ* = 0.79, *k* = 4.56, and *H* = 1.66 bits. The kurtosis values are influenced heavily by individual outliers. The *SD*s and entropies of the distributions are higher for our data sets than for those of traditional photographs, due to the higher dynamic range and the presence of concentrated illumination sources.
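As an illustration of how such summary statistics might be computed, the sketch below estimates the SD, kurtosis, and differential entropy of a one-dimensional sample. It assumes the non-excess definition of kurtosis (3 for a Gaussian) and a simple histogram estimator of entropy; the function `summary_stats`, the bin count, and the synthetic Gaussian input are illustrative choices, not the procedure used in the original analysis.

```python
import math
import random

def summary_stats(samples, bins=64):
    """Return (SD, kurtosis, differential entropy in bits) of a 1-D sample."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    sd = math.sqrt(var)
    # Non-excess kurtosis (assumed definition): fourth central moment / variance^2.
    kurt = sum((x - mean) ** 4 for x in samples) / n / var ** 2
    # Histogram estimate of H = -integral of p(x) * log2 p(x) dx.
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    entropy = 0.0
    for c in counts:
        if c:
            mass = c / n                       # probability mass in this bin
            entropy -= mass * math.log2(mass / width)  # density = mass / width
    return sd, kurt, entropy

# Synthetic stand-in for log-luminance values (not real illumination data).
random.seed(0)
gauss = [random.gauss(0.0, 1.0) for _ in range(100_000)]
sd, kurt, h = summary_stats(gauss)
print(round(sd, 2), round(kurt, 2), round(h, 2))
```

For real data, `samples` would hold the log-luminance values of an illumination map; a heavy-tailed sample would show a kurtosis well above 3.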

*SD*, kurtosis, and differential entropy of log luminance values for individual images in each data set. Kurtosis varies more from one image to another than *SD* and differential entropy.

| Statistic | Teller σ | Teller k | Teller H | Debevec σ | Debevec k | Debevec H |
|---|---|---|---|---|---|---|
| Mean | 1.02 | 5.15 | 1.64 | 1.27 | 8.83 | 1.90 |
| SD | 0.21 | 4.20 | 0.33 | 0.39 | 6.82 | 0.39 |
| Min | 0.57 | 1.69 | 0.80 | 0.73 | 2.26 | 1.30 |
| Max | 1.81 | 19.88 | 2.43 | 1.82 | 21.46 | 2.44 |

*p*_{1} and *p*_{2} are highly correlated. Much of the mass of the joint histogram concentrates near the diagonal where *p*_{1} = *p*_{2}. In agreement with Huang and Mumford (1999), we found that *p*_{1} + *p*_{2} and *p*_{1} − *p*_{2} are more nearly independent than *p*_{1} and *p*_{2}. In particular, the mutual information^{3} of *p*_{1} and *p*_{2} is 2.41 bits for the Teller images and 3.25 bits for the Debevec images, whereas that of *p*_{1} + *p*_{2} and *p*_{1} − *p*_{2} is only 0.10 bits for the Teller images and 0.07 bits for the Debevec images. Hence, the percentage difference between the luminance incident from two horizontally adjacent spatial directions is roughly independent of the mean luminance from those two directions.
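The effect of rotating to sum and difference coordinates can be illustrated on synthetic data. The sketch below uses a histogram estimate of mutual information and a pair of correlated Gaussian variables as a stand-in for *p*_{1} and *p*_{2}; the estimator `mutual_information`, the bin count, and the correlation value are assumptions made for illustration, and real log-luminance pairs are not Gaussian.

```python
import math
import random

def mutual_information(xs, ys, bins=32):
    """Histogram estimate of mutual information (bits) between paired samples."""
    n = len(xs)
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)

    def bin_index(v, lo, hi):
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint = {}
    px = [0] * bins
    py = [0] * bins
    for x, y in zip(xs, ys):
        i, j = bin_index(x, x_lo, x_hi), bin_index(y, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    # Sum p(i, j) * log2( p(i, j) / (p(i) * p(j)) ) over occupied cells.
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

# Synthetic, strongly correlated pair (hypothetical correlation rho = 0.95).
random.seed(0)
rho = 0.95
p1, p2 = [], []
for _ in range(100_000):
    a, b = random.gauss(0, 1), random.gauss(0, 1)
    p1.append(a)
    p2.append(rho * a + math.sqrt(1 - rho ** 2) * b)

sums = [a + b for a, b in zip(p1, p2)]
diffs = [a - b for a, b in zip(p1, p2)]
mi_raw = mutual_information(p1, p2)
mi_rot = mutual_information(sums, diffs)
print(round(mi_raw, 2), round(mi_rot, 3))
```

In this toy case the two variables have equal variance, so the sum and difference are exactly independent and the second estimate reflects only estimator bias; the measured values for illumination maps quoted above are small but not zero.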

*f*^{2+η}, where *f* represents the modulus of the frequency and *η* is a small constant that varies from scene to scene. A power spectrum of this form is characteristic of self-similar image structure. If one zooms in on one part of the image, the power spectrum will typically change only by an overall scaling factor.
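A short worked step makes the scaling claim explicit. Assume the power spectrum follows the power law exactly, $S(f) = A / f^{2+\eta}$, where $A$ is a hypothetical constant introduced here for illustration. Rescaling the frequency axis by any factor $\beta > 0$ then gives:

```latex
S(\beta f) = \frac{A}{(\beta f)^{2+\eta}}
           = \beta^{-(2+\eta)} \, \frac{A}{f^{2+\eta}}
           = \beta^{-(2+\eta)} \, S(f)
```

The rescaled spectrum has the same shape as the original and differs only by the constant factor $\beta^{-(2+\eta)}$, which is the sense in which a power-law spectrum indicates self-similar structure.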

*L*, a non-negative integer analogous to frequency. The 2*L* + 1 spherical harmonics of order *L* span a space that is closed under rotation (Inui, Tanabe, & Onodera, 1996).

*L* will fall off as 1/*L*^{2+η}.

*L* as the mean of squares of the coefficients at that order. Teller’s data lack information about the lowest portion of the illumination hemisphere. We applied a smooth spatial window to these illumination maps before transforming them to the spherical harmonic domain.

*k/L*^{2}. We fit a straight line on log-log axes to the power spectrum of each image in the Teller data set. The best-fit lines had slopes ranging from −1.88 to −2.62, with a mean of −2.29. All 95 regressions gave R-square values of at least 0.95, with 86 of them above 0.97 and a mean R-square value of 0.98, indicating excellent fits. When we fixed the slope to −2 in all regressions, we also found good fits, with a minimum R-square value of 0.93 and a mean of 0.96. Fixing the slope to −2.29 gave a minimum R-square value of 0.91 and a mean of 0.98.
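The log-log regression described above can be sketched as an ordinary least-squares fit of log power against log order. The example below runs on a synthetic spectrum with a hypothetical slope of −2.29 and multiplicative noise; the function `loglog_fit`, the range of orders, and the noise level are assumptions for illustration and do not reproduce the original data or fitting code.

```python
import math
import random

def loglog_fit(orders, power):
    """Least-squares line through (log10 L, log10 power).

    Returns (slope, intercept, R^2).
    """
    xs = [math.log10(l) for l in orders]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic spectrum: power = 5 / L^2.29 with log-normal noise (not real data).
random.seed(1)
orders = list(range(1, 101))
power = [5.0 * l ** -2.29 * math.exp(random.gauss(0.0, 0.2)) for l in orders]
slope, intercept, r2 = loglog_fit(orders, power)
print(round(slope, 2), round(r2, 3))
```

A fixed-slope variant would fit only the intercept and compute R-square from the resulting residuals; how the original analysis defined R-square in that case is not stated in the text.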

*k/L*^{2+η} with *η* small. On the other hand, illumination maps that contain intense, localized light sources have smooth power spectra that remain flat at low frequencies before falling off at higher frequencies. The illuminations of Figure 7(c) and 7(d) both display this behavior; the power spectrum of a linear luminance version of Figure 7(c) is shown in Figure 13. In these images, one or a few luminous sources, such as the sun or incandescent lights, dominate the power spectrum. Because these light sources approximate point sources, their spectra are flat at low frequencies. If one clips the brightest pixel values in these images, the power spectra return to the familiar *k/L*^{2+η} form (Figure 13).

*f*^{2+η} power spectra whether pixel values are linear or logarithmic in luminance (Ruderman, 1994). These results on linear luminance images differ from ours because most previous researchers have avoided photographs of point-like luminous sources and have used cameras of limited dynamic range, such that a few maximum intensity pixels could not dominate the image power spectra. A natural illumination map, on the other hand, may be dominated by light sources occupying a small spatial area. Once the relative strength of such sources is reduced through clipping or a logarithmic transformation, illumination maps have power spectra similar to those of typical photographs.

*P*(*x*) ∝ exp(−‖*x/s*‖^{a}) accurately model the wavelet coefficient distributions of typical photographs and of ensembles of photographs (Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999). Panels (a) and (b) of Figure 15 show maximum likelihood fits of this form to the ensemble histogram of wavelet coefficients from the Teller images. The fits are reasonably accurate, although they tend to underestimate the actual distribution for high wavelet coefficient magnitudes. We observed similar behavior for fits to empirical wavelet coefficient distributions for individual illumination maps. This discrepancy from results reported in the natural image statistics literature may be due to the higher dynamic range of the illumination maps we analyzed.

*I*(x) be identical to those computed on normalized, rescaled versions of the images *β*^{ν}*I*(*β*x), where the exponent *ν* is independent of the scale *β* (Ruderman, 1994). An exponent *ν* = 0 leads to two-dimensional power spectra of the form 1/*f*^{2}, where *f* is the modulus of frequency. More generally, a nonzero exponent *ν* leads to power spectra of the form 1/*f*^{2−ν}. For a scale-invariant image ensemble, the variance of wavelet coefficient distributions will follow a geometric sequence at successively coarser scales. If the wavelet basis is normalized such that wavelets at different scales have constant power, as measured by the *L*^{2} norm, then the variance will increase by a factor of 2^{2+ν} at successively coarser scales. If we increase the amplitude of the basis functions by a factor of 2 at each coarser scale, then the variance of the coefficients will increase by a factor of only 2^{ν} at successively coarser scales. Panels (c) and (d) of Figure 15 illustrate the results of such rescaling. Because *ν* is small, the distributions change little from one scale to the next. Note that linear-luminance illumination maps are not strictly scale invariant, as evidenced by the fact that their power spectra often deviate significantly from the 1/*f*^{2−ν} form. The distributions of wavelet coefficients at successive scales suggest, however, that illumination maps do possess scale-invariant properties apart from the contributions of bright localized light sources.

*k* = 2/*π* in the equal-area projection. We confirmed that the coefficient distributions for both vertically and horizontally oriented filters at successive scales are similar to those observed for spherical wavelets in Figure 15.

- Illumination maps have a much greater angular extent than typical photographs.
- Photographs are typically taken in a nearly horizontal direction, matching the experience of human vision. Illumination maps are omnidirectional, with most power typically incident from above. Illumination maps often include primary light sources, such as the sun; photographs tend to avoid these.
- Illumination maps have an intrinsic sense of orientation, which photographs may or may not share.
- Illumination maps generally have a much higher dynamic range than typical photographs.
- Illumination maps are linear in luminance, whereas most photographic devices compress the luminance range in a nonlinear and often uncharacterized fashion.

*f*^{2} power spectrum; although the power spectrum resembles that of natural illumination, the resulting sphere does not look realistic at all.^{4} The illumination map of (c) was synthesized to have a pixel intensity distribution and marginal wavelet coefficient distributions identical to those of (a), using the texture synthesis technique of Heeger and Bergen (1995). This sphere looks much more realistic, and human observers are able to recognize that its reflectance properties are similar to those of the sphere in (a) (Fleming, Dror, et al., 2003). Finally, the illumination map of (d) was created using the texture synthesis technique of Portilla and Simoncelli (2001), which ensures that not only its pixel intensity distribution and marginal wavelet coefficient distributions but also certain properties of its joint wavelet coefficient distributions match those of (a). This synthetic illumination map captures the presence of edges in the real illumination map, leading to a sphere whose apparent reflectance properties are even more similar to those of (a). This suggests that the statistical properties of natural illumination described in this chapter play an important role in reflectance estimation by the human visual system (as discussed in Fleming, Dror, et al., 2003). It also suggests that one may be able to produce realistic renderings using properly synthesized illumination.

*f*^{2+η} model, violating scale invariance. Illumination maps display nonstationary statistical properties, such as different distributions of illumination intensity at different elevations. Typical photographs may also lack stationarity, but their nonstationary properties have received little attention in the literature. Wavelet coefficient distributions are fairly regular from one illumination map to another, but fits to generalized Laplacian distributions are less tight than those previously observed for more typical photographs (Buccigrossi & Simoncelli, 1999; Huang & Mumford, 1999).

^{2} The differential entropy *H* of *X* is defined as $H(X) = -\int p(x) \log_2 p(x)\, dx$, where $p(x)$ is the probability density of *X*. Differential entropy is a measure of information content for a continuous random variable. The distribution with variance *σ*^{2} that maximizes differential entropy is the Gaussian, which has differential entropy $\frac{1}{2}\log_2(2\pi e \sigma^2)$ bits. On the other hand, a distribution that concentrates all probability density near a few discrete points could have an arbitrarily negative differential entropy.
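As a consistency check on the maximum-entropy statement, the sketch below evaluates the standard Gaussian differential entropy, ½ log₂(2π*e*σ²) bits, for the three (*σ*, *H*) pairs quoted in the text and compares each reported entropy with that bound. The helper name `gaussian_entropy_bits` and the dictionary labels are illustrative; the excerpt does not name the data set behind the first pair, so it is labeled generically.

```python
import math

def gaussian_entropy_bits(sigma):
    """Differential entropy (bits) of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

# (sigma, H in bits) pairs quoted in the text for log-luminance distributions.
reported = {
    "first reported data set": (1.04, 2.06),
    "Debevec image set": (1.32, 2.21),
    "Huang and Mumford": (0.79, 1.66),
}

for name, (sigma, h) in reported.items():
    bound = gaussian_entropy_bits(sigma)
    # A valid entropy cannot exceed the Gaussian value for the same variance.
    print(f"{name}: H = {h:.2f} bits, Gaussian bound = {bound:.2f} bits")
```

Each reported entropy falls below the Gaussian value for its standard deviation (roughly 2.10, 2.45, and 1.71 bits respectively), as the maximum-entropy property requires.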

^{4} The illumination map of Figure 18(b) was synthesized in the spherical harmonic domain. The maps of (c) and (d) were synthesized in a rectangular domain corresponding to an equal-area cylindrical projection of the sphere. In (c) and (d), we performed principal component analysis in color space to produce three decorrelated color channels, each of which is a linear combination of the red, green, and blue channels. We then synthesized textures independently in each channel of this remapped color space, as suggested by Heeger and Bergen (1995). Unfortunately, the nonlinear dependencies between the decorrelated color channels are much more severe for high dynamic range illumination maps than for the 8-bit images common in the texture analysis literature. To reduce artifacts associated with these dependencies, we passed the original illumination maps through a compressive nonlinearity on luminance before wavelet analysis, and then applied the inverse nonlinearity to the synthesized illumination maps. The compressive nonlinearity leads to a less heavy-tailed distribution of pixel intensities.


*Computational models of visual processing*. Cambridge, MA: MIT Press.

*International Journal for Numerical Methods in Engineering*, 52, 239–271.

*IEEE Transactions on Image Processing*, 8, 1688–1701.

*The world in perspective: A directory of world map projections*. New York: John Wiley & Sons.

*Proceedings of SIGGRAPH*, 1998, 189–198.

*Proceedings of SIGGRAPH*, 2000, 145–156.

*Proceedings of SIGGRAPH*, 1997, 369–378.

*Surface reflectance recognition and real-world illumination statistics* (AI Lab Technical Report, AITR-2002-009). Cambridge, MA: MIT Artificial Intelligence Laboratory.

*Proceedings of IEEE Workshop on Statistical and Computational Theories of Vision, Vancouver, Canada*.

*Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii*.

*International Conference on Image Processing, Rochester, NY*.

*Journal of the Optical Society of America A*, 4, 2379–2394.

*Journal of Vision*, 3(5), 347–368, http://journalofvision.org/3/5/3/, doi:10.1167/3.5.3.

*Journal of Vision*, 3(9), 73a, http://journalofvision.org/3/9/73/, doi:10.1167/3.9.73.

*Journal of Vision*, 3(9), 59a, http://journalofvision.org/3/9/59/, doi:10.1167/3.9.59.

*Proceedings of SIGGRAPH*, 1997, 229–238.

*Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, 1, 541–547.

*Technology Review*, 107, 70–73.

*Group theory and its applications in physics*. Springer: New York, Berlin, and Heidelberg.

*Proceedings of the International Conference on Acoustics, Speech, and Signal Processing*, 1980, 291–294.

*Annual Review of Psychology*, 55, 271–304.

*Pattern Analysis and Machine Intelligence*, 25, 57–74.

*Z. Naturforsch*., 36c, 910–912.

*Sixteenth Annual Conference on Neural Information Processing Systems, Vancouver, Canada*.

*Proceedings of SIGGRAPH*, 1995, 39–46.

*Proceedings of SIGGRAPH*, 2003, 376–381.

*American Scientist*, 88, 238–245.

*International Journal of Computer Vision*, 40, 49–71.

*Proceedings of the International Conference on Image Processing*, Thessaloniki, Greece.

*Proceedings of SIGGRAPH*, 2001, 159–170.

*Network-Computation in Neural Systems*, 5, 517–548.

*Proceedings of SIGGRAPH*, 1995, 161–172.

*Nature: Neuroscience*, 4, 819–825.

*Proceedings SPIE, 44th Annual Meeting, Denver, CO*.

*Subband Image Coding* (pp. 143–192). Norwell, MA: Kluwer Academic Publishers.

*Proceedings of the International Conference on Image Processing, Lausanne, Switzerland*.

*Annual Review of Neuroscience*, 24, 1193–1216.

*Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii*.

*Ophthalmology and Physiological Optics*, 12, 229–232.

*Contextual priming for object detection* (Memo 2001-020). Cambridge, MA: MIT Artificial Intelligence Laboratory.

*Proceedings of the Royal Society of London B*, 265, 359–366.

*Applied and Computational Harmonic Analysis*, 11, 89–123.