“Shape from shading” is a term that derives from algorithmic or machine vision (Forsyth & Ponce,
2002; Horn & Brooks,
1989; van Diggelen,
1959; Zhang, Tsai, Creyer, & Shah,
1999). It refers to a class of algorithms that computes three-dimensional shape on the basis of image structure. Of course, this involves numerous assumptions, some generic, some limiting, some incoherent (i.e., in conflict with physics). Typical assumptions include uniform, homogeneous light fields, Lambertian surfaces of uniform albedo, and the absence of both vignetting and multiple scattering. Light fields are often confined to a collimated beam, such as direct sunlight. Lambertian surfaces (Lambert,
1760) do not exist in the strict sense, though white plaster or paper comes close. Vignetting (Forsyth & Zisserman,
1991; Koenderink & van Doorn,
1983,
2003b) is the result of occlusion of the source by the scene itself. Cast and body shadows are the simplest examples. More complicated, and often important, cases involve extended sources and non-convex objects. Similar constraints apply to multiple scattering, which depends upon disjunct surface elements being able to “see each other.” In order to get rid of vignetting and multiple scattering, one would have to limit the scene to a single convex object—say an egg in outer space—and consider only the illuminated side. Of course, this is rarely of much interest. More realistic instances include shallow reliefs; in such cases, both vignetting and multiple scattering become negligible. Many scenes contain areas of this type because smoothly curved surfaces approximate flattish relief when considered in sufficiently small areas. Given these assumptions, the luminance sampled by the camera will reflect the illuminance of the surfaces in the scene (though the factor of proportionality is undefined), and one may invoke Lambert's cosine law (Lambert,
1760) to make the connection with scene geometry. The angle at which the beam strikes a surface determines its illuminance; thus, the sampled luminance variations reflect variations of surface attitude with respect to the direction of the illuminating beam. Although the shading is insufficient to specify scene geometry fully in this way, one at least finds solutions
modulo a group of ambiguities (Belhumeur, Kriegman, & Yuille,
1999). These ambiguities include absolute distance, overall spatial attitude (“additive plane”), and depth of relief (“bas-relief ambiguity”; Belhumeur et al.,
1999). Such ambiguities also occur in human perception (Brewster,
1832; Hill & Bruce,
1994; Kleffner & Ramachandran,
1992; Ramachandran,
1988a,
1988b; Rittenhouse,
1786).
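Both Lambert's cosine law and the bas-relief ambiguity can be illustrated numerically. The sketch below (variable names and parameter values are illustrative assumptions, not drawn from the cited works) renders a few albedo-scaled surface normals under a collimated beam via the cosine law, then applies a generalized bas-relief transformation in the spirit of Belhumeur et al. (1999) to both surface and source, and checks that the shaded image is unchanged:

```python
# Illustrative sketch (assumed setup, not the authors' code): Lambert's
# cosine law and the generalized bas-relief (GBR) ambiguity.
import numpy as np

rng = np.random.default_rng(0)

# Albedo-scaled normals b = albedo * n, one row per surface element,
# oriented toward the camera (+z); s is a collimated source vector
# (intensity times direction).
b = rng.normal(size=(5, 3))
b[:, 2] = np.abs(b[:, 2])
s = np.array([0.3, 0.2, 1.0])

def render(b, s):
    # Lambert's cosine law: intensity is the clamped dot product of the
    # scaled normal and the source vector, i.e. proportional to the
    # cosine of the angle of incidence.
    return np.maximum(b @ s, 0.0)

# A GBR transformation: mu, nu shear the relief (the "additive plane"),
# lam scales its depth (the "bas-relief" flattening proper).
mu, nu, lam = 0.4, -0.2, 0.5
G = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [mu,  nu,  lam]])

# Transforming the normals by G^{-T} and the source by G leaves every
# dot product b.s unchanged, so the rendered image is identical:
b_t = b @ np.linalg.inv(G)   # row-wise b -> G^{-T} b
s_t = G @ s

print(np.allclose(render(b, s), render(b_t, s_t)))  # prints: True
```

Since the two scenes (a deep relief under oblique light, a flattened relief under more frontal light) produce the same shading, no shape-from-shading algorithm operating on the image alone can distinguish them; this is the geometric content of the ambiguities listed above.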