The psychophysical paradigm of “luminance contrast masking” (LCM) was developed to probe the visual representation of luminance contrast, orientation, and spatial frequency of static visual patterns (Foley, 1994; Itti, Koch, & Braun, 2000; Lee, Itti, Koch, & Braun, 1999; Legge & Foley, 1980; Wilson, 1980). Results from this approach agree quantitatively with the dependence of responses of cortical neurons on luminance contrast, orientation, and spatial frequency (Geisler & Albrecht, 1997; Itti et al., 2000). In addition, LCM can uncover how visual representations are altered by attention (Carrasco, Penpeci-Talgar, & Eckstein, 2000; Freeman, Sagi, & Driver, 2001; Lee, Itti, et al., 1999; Morrone, Denti, & Spinelli, 2002; Zenger, Braun, & Koch, 2000). For example, LCM reveals that attention intensifies competitive interactions among visual filters, resulting in a higher effective gain and a sharper effective tuning for static visual patterns (Braun, Koch, Lee, & Itti, 2001; Lee, Itti, et al., 1999).
Here, we ask whether LCM reveals comparable attention effects for dynamic visual patterns. Traditionally, attention was thought to interact only weakly with the perception of visual motion. Manipulations of attention with cueing and visual search paradigms typically produce little or no effect on the perception of visual motion (Raymond, 2000). However, more recent psychophysical work (Chaudhuri, 1990; Raymond, O'Donnell, & Tipper, 1998) as well as neuroimaging (Gandhi, Heeger, & Boynton, 1999; Huk & Heeger, 2000; Saenz, Buracas, & Boynton, 2002; Watanabe et al., 1998) and neurophysiological studies (Martinez-Trujillo & Treue, 2002; Seidemann & Newsome, 1999; Treue & Maunsell, 1996) have established robust attention effects on neural responses to visual motion. We attempt to quantify attention effects on the perception of visual motion with the help of LCM.
An obstacle to achieving this goal is that visual motion is represented at multiple levels in the visual system. Particularly relevant here are representations of “component” and “pattern” motion (Adelson & Movshon, 1982; Simoncelli & Heeger, 1998; Welch, 1989; Wilson & Kim, 1994). Visual filters tuned to a particular spatiotemporal frequency are inherently ambiguous about the true direction and speed of motion (component motion; Adelson & Bergen, 1985). A wide range of spatiotemporal frequencies must be compared to identify the veridical motion vector (pattern motion; Adelson & Movshon, 1982; Welch, 1989). Whereas several areas of visual cortex, including area V1, are tuned to component motion, selectivity for pattern motion appears concentrated in the middle temporal cortex (area MT or V5; Huk & Heeger, 2002; Movshon, Adelson, Gizzi, & Newsome, 1985). The neural circuits underlying this transformation are under active study (Heuer & Britten, 2002; Movshon & Newsome, 1996).
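The ambiguity of component motion, and its resolution by comparing components, can be illustrated with a small numerical sketch. It assumes the standard intersection-of-constraints construction (Adelson & Movshon, 1982): a grating whose normal points along direction θ and drifts at normal speed s is consistent with any pattern velocity v satisfying v · n = s, where n = (cos θ, sin θ). The function names and example values are hypothetical illustrations and are not taken from the experiments reported here.

```python
import math


def normal_speed(velocity, theta):
    """Speed along the grating normal (angle theta, radians) for a pattern velocity.

    This is all that a single component measurement constrains (assumed model).
    """
    vx, vy = velocity
    return vx * math.cos(theta) + vy * math.sin(theta)


def intersect_constraints(theta1, s1, theta2, s2):
    """Recover the pattern velocity from two component measurements.

    Solves the 2x2 linear system  v . n1 = s1,  v . n2 = s2  by Cramer's rule.
    """
    a, b = math.cos(theta1), math.sin(theta1)
    c, d = math.cos(theta2), math.sin(theta2)
    det = a * d - b * c
    if abs(det) < 1e-12:
        # Parallel gratings give one constraint only: the ambiguity remains.
        raise ValueError("parallel components leave the pattern velocity ambiguous")
    vx = (s1 * d - b * s2) / det
    vy = (a * s2 - c * s1) / det
    return vx, vy


# Hypothetical example: two different pattern velocities look identical
# to a single component oriented at +45 degrees.
theta_a = math.pi / 4
same_a = normal_speed((2.0, 0.0), theta_a)
same_b = normal_speed((1.0, 1.0), theta_a)

# A second component at -45 degrees disambiguates the first velocity.
theta_b = -math.pi / 4
recovered = intersect_constraints(
    theta_a, normal_speed((2.0, 0.0), theta_a),
    theta_b, normal_speed((2.0, 0.0), theta_b),
)
```

In this sketch, a single component cannot distinguish the two candidate velocities, whereas two non-parallel components jointly determine the pattern velocity.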
The distinct representations of component and pattern motion were first studied with displays (“moving plaids”) that superimpose two moving gratings (Adelson & Movshon, 1982). However, intersections between gratings are perceptually conspicuous and complicate the interpretation of results (Stoner & Albright, 1992; Stoner, Albright, & Ramachandran, 1990; Wilson & Kim, 1994). Schrater, Knill, and Simoncelli (2000) filtered dynamic noise to distribute motion energy in a manner comparable to moving plaids but without introducing conspicuous features. Adopting a similar approach, we combined discrete “wavelets” of spatiotemporal luminance variation to create dynamic textures of spatially uniform appearance.
To distinguish the respective contributions of component and pattern mechanisms to the perception of visual motion, we took advantage of known properties of pattern-selective neurons in area MT. The response of such neurons to a preferred motion is reduced and, in some cases, even suppressed by the simultaneous presence of motion in the opposite direction. This nonlinear interaction between different motion components is known as “motion opponency” (Heeger, Boynton, Demb, Seidemann, & Newsome, 1999; Qian & Andersen, 1994; Snowden, Treue, Erickson, & Andersen, 1991). In fact, area MT as a whole responds only minimally to multiple motion components in random directions (Britten, Shadlen, Newsome, & Movshon, 1993; Rees, Friston, & Koch, 2000). Presumably, the response to any one component is inhibited by the simultaneous presence of the other components (Simoncelli & Heeger, 1998). Accordingly, dynamic patterns containing all directions of motion should drive component mechanisms far better than pattern mechanisms.
We conducted LCM experiments with just such a stimulus to probe the visual representation of component motion. Our results confirmed and extended several earlier studies on motion masking (Anderson & Burr, 1985, 1989; Anderson, Burr, & Morrone, 1991; Ferrera & Wilson, 1987; Lu & Sperling, 1995, 1996). To measure the effect of attention, we used an established dual-task paradigm (Braun, 1994, 1998; Braun & Julesz, 1998; Lee, Itti, et al., 1999; Lee, Koch, & Braun, 1997, 1999; Li, VanRullen, Koch, & Perona, 2002; Zenger et al., 2000) and compared LCM thresholds for moving patterns that are either fully or poorly attended.