The basic idea in motion perception is that, in the early stages of visual processing, motion is extracted in parallel by localized motion sensors tuned to spatial frequency, temporal frequency, and orientation (Levinson & Sekuler, 1975; Adelson & Movshon, 1982; Anderson & Burr, 1987, 1989, 1991; Anderson, Burr, & Morrone, 1991). Classical energy models of human visual motion sensing have successfully implemented this basic structure to explain motion phenomena such as apparent motion, the missing-fundamental illusion, and reverse phi (Adelson & Bergen, 1985; Watson & Ahumada, 1985; van Santen & Sperling, 1985). However, substantial psychophysical evidence suggests the existence of an inhibitory mechanism that produces interactions between motion sensors tuned to different scales at later stages of motion processing. Along these lines, human observers have been reported to make systematic errors in discriminating the motion direction of very briefly presented stimuli containing features designed to activate motion sensors tuned to high and low spatial frequencies (Derrington & Henning,
1987; Henning & Derrington,
1988; Derrington, Fine, & Henning,
1993; Nishida, Yanagi, & Sato,
1995; Serrano-Pedraza et al.,
2007; Serrano-Pedraza & Derrington,
2010; Serrano-Pedraza, Gamonoso-Cruz, Sierra-Vázquez, & Derrington,
2013; see also the “Interaction across different spatial scales” section in Nishida,
1987). In particular, at short durations, when a moving high-spatial-frequency pattern is added to a static low-spatial-frequency pattern, observers make systematic errors in discriminating its direction of motion (Derrington & Henning, 1987).
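The energy-model scheme described above can be sketched in a few lines. The following is a minimal, illustrative opponent-energy detector in the style of Adelson and Bergen (1985) for a one-dimensional (space × time) stimulus: quadrature pairs of spatial and temporal filters are combined into space-time-oriented units, squared, and subtracted. All function names and filter parameters here are illustrative assumptions, not taken from any of the cited studies.

```python
import numpy as np

def gabor_pair(x, sf, sigma):
    """Quadrature pair of spatial filters (even/odd Gabors) tuned to sf."""
    env = np.exp(-x**2 / (2 * sigma**2))
    return env * np.cos(2 * np.pi * sf * x), env * np.sin(2 * np.pi * sf * x)

def temporal_pair(t, tf, tau=0.1):
    """Approximate quadrature pair of causal temporal filters tuned to tf."""
    env = np.exp(-t / tau)
    return env * np.cos(2 * np.pi * tf * t), env * np.sin(2 * np.pi * tf * t)

def opponent_energy(stimulus, x, t, sf=4.0, tf=8.0, sigma=0.15):
    """Opponent motion energy for a (time x space) stimulus array.

    Positive values signal rightward motion, negative values leftward.
    """
    fe, fo = gabor_pair(x, sf, sigma)
    ge, go = temporal_pair(t, tf)

    def respond(fs, ft):
        # Separable filtering: project onto the spatial filter at each
        # frame, then convolve the resulting time course with the
        # (causal) temporal filter.
        sp = stimulus @ fs
        return np.convolve(sp, ft)[: len(t)]

    ee, eo = respond(fe, ge), respond(fe, go)  # even space x even/odd time
    oe, oo = respond(fo, ge), respond(fo, go)  # odd space x even/odd time
    right = (ee - oo) ** 2 + (eo + oe) ** 2    # rightward-oriented energy
    left = (ee + oo) ** 2 + (eo - oe) ** 2     # leftward-oriented energy
    return float(np.sum(right - left))
```

Fed a drifting grating matched to its tuning, such a detector returns positive opponent energy for rightward motion and negative for leftward. Note that this is a single-scale detector working in isolation; the inhibitory interactions across spatial scales discussed above are precisely what this classical architecture does not capture.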