Many previous studies have focused on the glossiness perceived from a single stimulus image (Anderson & Kim, 2009; Beck & Prazdny, 1981; Berzhanskaya, Swaminathan, Beck, & Mingolla, 2005; Ferwerda, Pellacini, & Greenberg, 2001; Fleming, Dror, & Adelson, 2003; Kim & Anderson, 2010; Motoyoshi, Nishida, Sharan, & Adelson, 2007; Nagata, Okajima, & Osumi, 2007; Pellacini, Ferwerda, & Greenberg, 2000). However, relying only on a single-image cue (i.e., a monocular static cue) can lead to misestimation of glossiness. For instance, Motoyoshi et al. (2007) proposed that, in perceiving glossiness, the human visual system may exploit the skewness of the luminance histogram, a simple statistical measure derived from a single image. However, such a statistical measure does not always correlate with the actual surface reflectance properties (Anderson & Kim, 2009). Similarly, although it is well known that a specular highlight is an important cue to glossiness (Beck & Prazdny, 1981; Berzhanskaya et al., 2005), a matte surface with a highlight-like texture in a single image appears glossy (Hartung & Kersten, 2002). A cue in a single image can therefore produce an incorrect estimate of glossiness. Nevertheless, we do not usually face such problems in our daily lives. How, then, does the human visual system overcome such misestimation problems?