The fact that we typically make several saccadic eye movements every second means that the position of objects on the retina is constantly changing. Thus, one of the fundamental questions of vision science is how we keep track of the location of objects across saccades (for review, see Bays & Husain, 2007; Melcher & Colby, 2008; Wurtz, 2008). A perhaps more basic question, however, is how our naive perception of a smooth and continuous visual flow is built out of a series of relatively brief visual snapshots separated by abrupt jumps, like a poorly filmed home movie. This problem is made even clearer by the fact that the new input to the eyes in each fixation must travel through the visual system before it reaches awareness, requiring around 120–200 ms (Genetti, Khateb, Heinzer, Michel, & Pegna, 2009; Liu, Agam, Madsen, & Kreiman, 2009; Thorpe, Fize, & Marlot, 1996), and that visual input is partially suppressed while a saccade is performed (Burr, Morrone, & Ross, 1994). The challenge of achieving stable perception from discrete and discontinuous input is particularly troublesome in the case of visual motion. Although the brain is extremely efficient at integrating motion cues across space and over periods of several seconds (Burr & Santoro, 2001; Neri, Morrone, & Burr, 1998), motion detectors are typically assumed to operate in retinal coordinates (although see Ong, Hooshvar, Zhang, & Bisley, 2009). Unless motion signals for the same object are integrated across saccades (Melcher & Morrone, 2003), this impressive ability to integrate motion over time would be essentially useless.
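To make the coordinate-frame problem concrete, consider the following first-order sketch (the notation is ours for illustration and is not taken from the cited studies): to a first approximation, a feature's spatiotopic (world-centered) position is the sum of its retinal position and the current eye position, so each saccade shifts the retinal image by the amplitude of the eye movement,

\[
  x_{\text{spatiotopic}}(t) \;=\; x_{\text{retinal}}(t) + e(t),
  \qquad
  x_{\text{retinal}}(t^{+}) \;=\; x_{\text{retinal}}(t^{-}) - \Delta e ,
\]

where \(e(t)\) is the eye position and \(\Delta e\) the saccade amplitude. A purely retinotopic motion detector that continues to sum its input across the saccade would therefore combine signals arising from different spatiotopic locations; preserving the benefit of long-duration integration requires that the signals be linked to the same object or spatiotopic location across the eye movement.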