Despite the complexity of resolving the shadow correspondence problem for the perception of cast shadows, the visual system performs with relative efficiency (Mamassian & Goutcher,
2001). Rather than being ignored, cast shadows have been shown to contribute substantially to determining the 3D spatial layout of the visual scene (see Dee & Santos,
2011). For example, Hubona, Wheeler, Shirah, and Brandt (
1998) demonstrated that cast shadows greatly influence the perceived depth of stereoscopically defined objects. Kersten, Knill, Mamassian, and Bulthoff (
1996) demonstrated that the relative position of the cast shadow can govern the perceived trajectory of a moving object and that a moving cast shadow can elicit illusory motion in a stationary object. Moreover, cast shadows can be used as an effective cue to aid visual search and the segregation of local visual information (e.g., Cunningham, Beck, & Mingolla,
1996; Lovell, Gilchrist, Tolhurst, & Troscianko,
2009; Rensink & Cavanagh,
2004), and they aid in the disambiguation of object and surface shape (see Cavanagh & Leclerc,
1989; Madison, Thompson, Kersten, Shirley, & Smits,
2001), which in turn has been shown to contribute to the recognition of objects (Braje, Legge, & Kersten,
2000; Tarr, Kersten, & Bulthoff,
1998). Recently, Mamassian (
2004) has suggested that the visual system solves the shadow correspondence problem by implementing a coarse-scale analysis. That is, the visual system is largely insensitive to local differences in structural congruence (e.g., conformity to a particular lighting direction) between the shadow and the casting object; instead, image characteristics such as their “center of mass” are used as a basis for matching. Mamassian (
2004) noted that cast shadow percepts are evident even when local shadow-object matches represent different lighting directions. These “impossible shadows” are commonly observed in art and indicate that the visual system emphasizes global rather than local image properties in the shadow matching process (see Casati,
2008; Cavanagh,
2005). This coarse analysis is perhaps optimal, as it provides a quick means of discerning the 3D structure of the visual scene while ignoring fine detail that may hinder the detection process. However, this process is not well understood and remains the focus of much research.
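
The idea of matching on a coarse property such as center of mass can be illustrated with a minimal sketch. This is not Mamassian's (2004) model: the nearest-neighbor pairing rule, the pixel-list representation of regions, and the function names `center_of_mass` and `match_shadows` are all assumptions introduced here for illustration. The sketch shows only that a matcher of this kind never compares the shapes of shadow and object, so it is indifferent to local inconsistencies in implied lighting direction.

```python
from math import hypot


def center_of_mass(pixels):
    """Mean (x, y) position of a region given as a list of (x, y) pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def match_shadows(objects, shadows):
    """Pair each shadow with the object whose center of mass is nearest.

    Hypothetical coarse-scale rule (an assumption, not taken from the
    literature): only the centers of mass are compared; region shape
    and implied lighting direction are ignored entirely.
    """
    object_centers = {name: center_of_mass(px) for name, px in objects.items()}
    matches = {}
    for shadow_name, px in shadows.items():
        sx, sy = center_of_mass(px)
        matches[shadow_name] = min(
            object_centers,
            key=lambda name: hypot(
                object_centers[name][0] - sx, object_centers[name][1] - sy
            ),
        )
    return matches


# Two square objects and two differently shaped shadow regions (toy data).
objects = {
    "A": [(0, 0), (1, 0), (0, 1), (1, 1)],
    "B": [(10, 0), (11, 0), (10, 1), (11, 1)],
}
shadows = {
    "s1": [(2, 3), (3, 3), (4, 3)],
    "s2": [(9, 4), (10, 4)],
}

matches = match_shadows(objects, shadows)
print(matches)
```

In this toy scene, shadow `s1` is paired with object `A` and `s2` with object `B`, even though neither shadow has the square outline of its object. A rule of this kind would accept the “impossible shadows” described above, since shape and lighting consistency play no role in the pairing.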