Let us begin by summarizing some prior research on the perception of shape from texture. Several different properties of optical texture patterns have been identified as possible sources of information for the perception of local slant. One popular approach for estimating slant is based on an assumption that variations in reflectance on a surface are statistically isotropic. In the special case of polka dot textures, as shown in Figure 1, the optical slant (τ) at the center of each element can be determined by the following equation:
$$\cos \tau = \frac{\omega}{\lambda} \tag{1}$$
where λ and ω are the major and minor axes of its optical projection. This is often referred to as the foreshortening cue. Similar computations can also be performed for less regular isotropic textures from the distribution of edge orientations in each local image region (Aloimonos, 1988; Blake & Marinos, 1990; Blostein & Ahuja, 1989; Marinos & Blake, 1990; Witkin, 1981), or from the relative anisotropy of their local amplitude spectra (Bajcsy & Lieberman, 1976; Brown & Shvayster, 1990; Krumm & Shafer, 1992; Sakai & Finkel, 1995; Super & Bovik, 1995). One important weakness of this cue relative to others is that it cannot reveal the sign of slant, only its magnitude.
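The foreshortening computation for a circular element can be illustrated with a minimal sketch. It assumes the standard relation cos τ = ω/λ for a circular dot under locally orthographic projection; the function names are hypothetical and chosen only for this example. The second function shows why the cue is blind to the sign of slant: slants of equal magnitude and opposite sign project to the same aspect ratio.

```python
import math


def foreshortening_slant_deg(major_axis: float, minor_axis: float) -> float:
    """Unsigned optical slant (degrees) from the projected axes of a circular dot.

    Assumes cos(slant) = minor_axis / major_axis, i.e. a circular element
    viewed under locally orthographic projection (an assumption of this sketch).
    """
    if major_axis <= 0 or not 0 <= minor_axis <= major_axis:
        raise ValueError("expected major_axis > 0 and 0 <= minor_axis <= major_axis")
    return math.degrees(math.acos(minor_axis / major_axis))


def projected_aspect_ratio(slant_deg: float) -> float:
    """Minor/major axis ratio of a circular dot at a slant between -90 and 90 degrees."""
    return math.cos(math.radians(slant_deg))


# A projected ellipse that is half as wide as it is long: roughly 60 degrees.
print(foreshortening_slant_deg(major_axis=2.0, minor_axis=1.0))

# Opposite signs of slant yield the same aspect ratio, so the sign is lost.
print(projected_aspect_ratio(60.0), projected_aspect_ratio(-60.0))
```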
An alternative approach that does not share this weakness is to estimate surface slant by measuring the changes of optical texture across different local neighborhoods of an image, based on an assumption that the texture on a physical surface is statistically homogeneous. As was first demonstrated by Purdy (1958), the optical slant (τ) in a given local region can be determined by the following equation:
$$\tan \tau = \frac{\lambda_1 - \lambda_2}{\delta\,\lambda_1} \tag{2}$$
where δ is the projected distance between neighboring optical texture elements in the direction that slant is being estimated, and λ1 and λ2 are the projected lengths of those texture elements in a perpendicular direction (see Figure 1). In the limit of an infinitesimally small δ, the right side of Equation 2 is equal to the normalized depth gradient (Purdy, 1958, Equation 14; Gårding, 1992, Equation 33). This is sometimes referred to as the scaling cue. Similar computations can also be performed on less regular textures from the affine correlations between the amplitude spectra in neighboring image regions (Clerc & Mallat, 2002; Malik & Rosenholtz, 1994, 1997) or from systematic changes in the distributions of edges (Gårding, 1992, 1993).
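A minimal sketch of the scaling computation follows. It assumes a discrete form in which tan τ is approximated by (λ1 − λ2)/(δ·λ1), with δ expressed in radians of visual angle. The choice of λ1 as the normalizing term is an assumption made here for illustration, and the function name is hypothetical. Unlike the foreshortening cue, the result carries a sign, because it depends on which of the two elements is larger.

```python
import math


def scaling_slant_deg(length_1: float, length_2: float, separation_rad: float) -> float:
    """Signed optical slant (degrees) from the change in projected element length.

    Assumes tan(slant) ~= (length_1 - length_2) / (separation_rad * length_1),
    a finite-difference approximation to the normalized depth gradient.
    This specific discrete form is an assumption of the sketch.
    """
    if length_1 <= 0 or length_2 <= 0:
        raise ValueError("projected lengths must be positive")
    if separation_rad == 0:
        raise ValueError("separation must be nonzero")
    gradient = (length_1 - length_2) / (separation_rad * length_1)
    return math.degrees(math.atan(gradient))


# Lengths shrinking by 1% over 0.01 rad of visual angle: close to 45 degrees.
print(scaling_slant_deg(1.0, 0.99, 0.01))

# Reversing the order of the elements reverses the sign of the estimate.
print(scaling_slant_deg(0.99, 1.0, 0.01))
```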
An extensive series of experiments and simulations was performed by Knill (1998a, 1998b) in an effort to determine the relative importance of these different possible texture cues for slant discrimination judgments. For example, one technique he employed involved manipulating the relative reliability of the cues by adding random variations to some local texture properties but not others. From the results of these studies, Knill concluded that observers' slant estimates are based primarily on the foreshortening cue. He was also the first to discover that discrimination thresholds for shallow slants are an order of magnitude larger than those obtained for steep slants. This finding has proven to be important for subsequent research on cue integration because it suggests that texture should be weighted more heavily for steep slants than for shallow slants in relation to other cues.
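The weighting argument can be illustrated with the standard inverse-variance (reliability-weighted) model of cue combination. That model is an assumption of this sketch and is not specified in the text above. The threshold values are invented purely for illustration, and the function name is hypothetical.

```python
def cue_weights(thresholds):
    """Normalized weights proportional to 1 / threshold**2 for each cue.

    Assumes the inverse-variance cue combination model, with discrimination
    thresholds standing in for each cue's standard deviation.
    """
    if any(t <= 0 for t in thresholds):
        raise ValueError("thresholds must be positive")
    reliabilities = [1.0 / (t * t) for t in thresholds]
    total = sum(reliabilities)
    return [r / total for r in reliabilities]


# Illustrative (made-up) thresholds in degrees: [texture, other cue].
shallow = cue_weights([20.0, 4.0])  # texture is unreliable at shallow slants
steep = cue_weights([2.0, 4.0])     # texture threshold is ten times smaller

print("texture weight, shallow slant:", shallow[0])
print("texture weight, steep slant:", steep[0])
```

Under this model, a tenfold reduction in the texture threshold shifts the texture weight from a small fraction to a clear majority, which is the qualitative pattern the cue integration literature predicts.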
A more recent series of studies by Todd, Thaler, and Dijkstra (2005) and Todd, Thaler, Dijkstra, Koenderink, and Kappers (2007) has produced a contradictory pattern of results. They used adjustment tasks in which observers were asked to duplicate the apparent variations in depth on a surface. The results revealed that the ability to distinguish slants from texture requires relatively large viewing angles, which provides strong evidence that observers' perceptions cannot be based on computational analyses within small local neighborhoods. In light of this finding, Todd et al. (2007) proposed a new source of information, called scaling contrast, which is defined by the following equation:
$$\text{scaling contrast} = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} \tag{3}$$
where λmax and λmin are the lengths of the largest and smallest texture elements over the entire extent of a visible surface. This measure is similar to Equation 2, but it is designed to evaluate the variations in scaling over large regions of visual space, rather than small local neighborhoods. Todd et al. (2007) found that it is highly correlated with observers' shape judgments over a wide range of conditions. Another interesting finding from these studies is that the variance in observers' settings did not change dramatically with slant, as has been reported by Knill (1998a, 1998b) and others for slant discrimination thresholds. If anything, the changes in variance as a function of slant were in the opposite direction.
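The scaling contrast measure described above can be sketched as follows. The sketch assumes a contrast of the form (λmax − λmin)/(λmax + λmin), computed over all projected element lengths on the visible surface. The function name and the sample lengths are hypothetical.

```python
def scaling_contrast(projected_lengths):
    """Contrast between the largest and smallest projected texture elements.

    Assumes the form (max - min) / (max + min), taken over the whole visible
    surface rather than over a small local neighborhood.
    """
    lengths = list(projected_lengths)
    if not lengths or min(lengths) <= 0:
        raise ValueError("need at least one positive projected length")
    largest, smallest = max(lengths), min(lengths)
    return (largest - smallest) / (largest + smallest)


# A surface with no variation in projected element size has zero contrast.
print(scaling_contrast([2.0, 2.0, 2.0]))

# A wider range of projected sizes yields a larger value.
print(scaling_contrast([3.0, 1.0, 2.0]))
```

Because only the extreme values enter the computation, the measure grows with the range of depths spanned by the visible surface, which is consistent with the reported dependence on viewing angle.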