Research Article  |   December 2008
Vector subtraction using visual and extraretinal motion signals: A new look at efference copy and corollary discharge theories
Journal of Vision December 2008, Vol.8, 24. doi:10.1167/8.14.24
      John A. Perrone, Richard J. Krauzlis; Vector subtraction using visual and extraretinal motion signals: A new look at efference copy and corollary discharge theories. Journal of Vision 2008;8(14):24. doi: 10.1167/8.14.24.

      © 2016 Association for Research in Vision and Ophthalmology.

Abstract

The question as to how the visual motion generated during eye movements can be ‘canceled’ to prevent an apparent displacement of the external world has a long history. The most popular theories (R. W. Sperry, 1950; E. von Holst & H. Mittelstaedt, 1950) lack specifics concerning the neural mechanisms involved and their loci. Here we demonstrate that a form of vector subtraction can be implemented in a biologically plausible way using cosine distributions of activity from visual motion sensors and from an extraretinal source such as a pursuit signal. We show that the net result of applying an ‘efference copy/corollary discharge signal’ in the form of a cosine distribution is a motion signal that is equivalent to that produced by vector subtraction. This vector operation provides a means of ‘canceling’ the effect of eye movements. It enables the extraretinal generated image motion to be correctly removed from the combined retinal–extraretinal motion, even in cases where the two motions do not share the same direction. In contrast to the established theories (efference copy and corollary discharge), our new model makes specific testable predictions concerning the location (the MT–MST/VIP areas) and nature of the eye-rotation cancellation stage (neural-based vector subtraction).

Introduction
Locomotion and visual navigation are essential behaviors for all biological species, but only some animals (including humans) rotate their eyes while moving through the world. Moving the eyes during locomotion provides an evolutionary advantage because threats approaching from the side are more easily detected. Smooth pursuit eye movements also help maximize spatial acuity by minimizing the retinal velocity of objects centered on the fovea (Eckert & Buchsbaum, 1993). However, mobile eyes create their own set of problems: The resulting retinal image motion is ambiguous because it could represent movement of the world, movement of the observer, or combinations of both. Despite this ambiguous input, our brains somehow manage to solve this ‘eye rotation problem’ and correctly construct the perception of a stable world. 
The question of how eye movements are distinguished from motion in the world has a long history. The earliest recorded description of the problem can be found in a treatise by Alhazen in 1093 but it was Helmholtz (1925) who encapsulated the problem best when he stated that the image motion caused by eye movements is sensed (i.e., registered on the retina) but not perceived. In the 1950s, two similar theories were put forward as an explanation for how the perception of movement could be ‘blocked’ during eye movements (Sperry, 1950; von Holst & Mittelstaedt, 1950): Sperry suggested that a ‘corollary discharge’ of the motor signal sent to move the eyes is also sent to a higher level ‘comparator’ stage. Von Holst and Mittelstaedt proposed a similar idea with an ‘efference copy’ of the signal being sent to the comparator. Both theories (CD/EC) argued that if retinal motion signals from the eye arrive at the comparator stage at the same time as the CD/EC, the signals ‘cancel’ each other and so no motion is perceived (Figure 1). This theory persists as the most popular ‘textbook’ explanation for the eye-rotation problem (e.g., Goldstein, 2007). The longevity of the CD/EC theory is not surprising because it can explain many aspects of perceptual stability in the presence of eye movements, e.g., if you push on your eyeball (no CD/EC signal) the world appears to move (Goldstein, 2007; von Holst, 1954). The basic principle of cancellation seems valid; it is the fine details of the theory that are missing. 
Figure 1
 
Standard block diagram representing the corollary discharge/efference copy theory. A copy of the motor signal sent to move the eye is also sent to a comparator unit. If the retinal motion signal (bottom of figure) arrives at the same time as the CD/EC signal, cancellation occurs and no motion is perceived.
In its current form, the CD/EC theory has two limitations: (1) It does not provide a detailed mechanism for the rotation ‘cancellation’ stage nor the exact locus and nature of the ‘comparator’. (2) It underestimates the complexity of the problem. Figure 2a shows the retinal slip from an eye rotation made by a stationary observer. The vectors in the bottom part of the figure represent the local retinal image motion. Note that all of the vectors are unidirectional and all have the same length. This is the situation normally addressed by the CD/EC theory. Figure 2b shows the retinal motion for a forward moving observer with a non-rotating eye. The motion is radial in structure and it emanates from a point coinciding with the direction of heading (Gibson, 1950). Figure 2c illustrates what happens to the image motion on the retina when an observer makes an eye rotation at the same time that they are moving forward (a very common scenario). The resulting motion is equal to the vector sum of the motion shown in Figures 2a and 2b (gray arrows). It is much more complex and contains multiple directions and speeds (Koenderink & van Doorn, 1975; Longuet-Higgins & Prazdny, 1980; Nakayama & Loomis, 1974; Regan & Beverley, 1982). Without some form of correction for the eye rotation, the observer will misperceive their direction of heading and/or their actual path of motion through the world (Cutting, Springer, Braren & Johnson, 1992; Regan & Beverley, 1982; Royden, Banks & Crowell, 1992; Stone & Perrone, 1997; van den Berg, 1993; Warren & Hannon, 1988). How does the brain recover the motion pattern shown in Figure 2b from the complex pattern shown in Figure 2c? The brain faces a much harder task in Figure 2c than it does in Figure 2a because multiple image velocities are present and it is not sufficient to simply apply a cancellation signal based on a single velocity vector. 
Figure 2
 
Complexity introduced by motion of the observer. (a) Representation of the retinal image motion caused by a pursuit eye movement to the right while observing static isolated points in the world. The flow field is largely uniform in speed and direction. (b) Image motion generated during forward translation of the observer with no eye movement. The image motion radiates out from a point in the middle of the visual field coinciding with the direction of heading. (c) Image motion generated during simultaneous forward translation and smooth pursuit to the right. The image motion is made up of the vector sum of the vectors in (a) and (b) and contains multiple directions and speeds.
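The composition illustrated in Figure 2 can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that all scene points lie at the same depth (so translational flow grows linearly with eccentricity), that pursuit adds the same vector at every location, and that pursuit to the right produces leftward image motion. The function names and numerical values are hypothetical and are not taken from the original figures.

```python
import math

def translation_flow(x, y, rate=0.2):
    """Radial flow for forward translation with the heading at (0, 0).
    ASSUMPTION: equal depth for all points, so speed scales with eccentricity."""
    return (rate * x, rate * y)

def pursuit_flow(speed=2.0, direction_deg=180.0):
    """Uniform flow added by the eye rotation (same vector at every location).
    ASSUMPTION: the speed and direction are arbitrary illustrative values."""
    a = math.radians(direction_deg)
    return (speed * math.cos(a), speed * math.sin(a))

def combined_flow(x, y):
    """Retinal motion during translation plus pursuit: the vector sum T + R."""
    tx, ty = translation_flow(x, y)
    rx, ry = pursuit_flow()
    return (tx + rx, ty + ry)

def speed_and_direction(v):
    """Convert (vx, vy) into (speed, direction in degrees within [0, 360))."""
    return math.hypot(v[0], v[1]), math.degrees(math.atan2(v[1], v[0])) % 360.0

# A few image locations (in degrees of visual angle, hypothetical).
for point in [(-20, 0), (0, 10), (20, 0), (10, -10)]:
    s, d = speed_and_direction(combined_flow(*point))
    print(point, round(s, 2), round(d, 1))
```

Each location ends up with its own speed and direction, which is why a single cancellation vector applied to the whole field is not enough in the situation of Figure 2c.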
There has been considerable debate as to whether or not humans can actually solve the eye-rotation problem in the context of a moving observer (e.g., Li, Sweet, & Stone, 2006; Li & Warren, 2000; Royden et al., 1992; Stone & Perrone, 1997; van den Berg, 1993; Warren & Hannon, 1988; Wilkie & Wann, 2006). Some initial studies showed that heading could be extracted reasonably accurately from motion flow fields without the aid of an extraretinal signal (Warren & Hannon, 1988), but later studies showed that when the speed of pursuit eye movements was higher, an extraretinal signal was important for correct performance. It is now reasonably well established that an ‘extraretinal’ (eye movement) signal does help us visually navigate and that some cancellation process is at work (Crowell & Andersen, 2001; Freeman, Banks, & Crowell, 2000; Li & Warren, 2000; Royden et al., 1992). The fact that we can navigate safely while making eye movements indicates that at some stage the brain must be compensating for the eye movements. Neural correlates of a form of cancellation have been demonstrated in neurons found in the Medial Superior Temporal (MSTd) area of the primate brain (Bradley, Maxwell, Andersen, Banks, & Shenoy, 1996; Erikson & Thier, 1991; Inaba, Shinomoto, Yamane, Takemura, & Kawano, 2007; Lee, Pesaran, & Andersen, 2007; Page & Duffy, 1999; Shenoy, Bradley, & Andersen, 1999; Upadhyay, Page, & Duffy, 2000) as well as in the parietal area VIP (Zhang, Heuer, & Britten, 2004). There is also clinical evidence that the ability to compensate for eye movements is lost when cortical lesions are present (Haarmeier, Thier, Repnow, & Petersen, 1997). We will therefore assume that some form of correction occurs but acknowledge that it may not always work perfectly (Crowell & Andersen, 2001; Freeman & Banks, 1998; Freeman et al., 2000; Haarmeier, Bunjes, Lindner, Berret, & Thier, 2001; van den Berg, Beintema, & Frens, 2001). 
Area MST of the primate brain has been shown to be involved in self-motion estimation (Britten & van Wezel, 1998; Duffy & Wurtz, 1991; Saito et al., 1986; Tanaka et al., 1986) and neurons in this area also respond to smooth pursuit eye movements (Inaba et al., 2007; Komatsu & Wurtz, 1988; Newsome, Wurtz, & Komatsu, 1988). Despite this connection, there have been very few attempts to model the eye movement compensation process in the context of human and primate self-motion estimation. This may be because the latter problem is difficult enough on its own without the added complication of eye movements. The recovery of observer motion parameters from retinal image motion is a complex non-linear problem (Koenderink & van Doorn, 1975; Longuet-Higgins & Prazdny, 1980) and very few models of this process exist that could be considered to be physiologically plausible and which map onto the known properties of MSTd neurons (see Perrone & Stone, 1998). The few models that attempt to extract heading and consider eye movements (Beintema & van den Berg, 1998; Lappe, 1998) do not incorporate a realistic ‘front-end’ stage with MT-like motion sensors. The extraction of the local image velocity is assumed to have occurred prior to the self-motion estimation stage and it is difficult to assess how well the models would work with realistic 2-dimensional image sequences with varying levels of contrast and spatial frequency content. Factors such as image contrast and spatial frequency have been shown to have an impact on the eye rotation cancellation process (Freeman & Banks, 1998) and models that leave out the motion processing stages prior to MST are ignoring a key part of the problem. A pursuit model proposed by Pack, Grossberg, and Mingolla (2001) did incorporate an explicit MT stage, but it was only applied to the relatively simple case of leftward or rightward pursuit and only dealt with the case of unidirectional motion. 
Furman and Gur (2003) developed a neural network model, which used an unsupervised training procedure to produce MST-like units that incorporated a pursuit signal. However they only considered the case of pursuit against a fixed background (of dots) and so the motion they were considering was uniform across the field (as in Figure 2a) and did not include multiple directions and speeds at each image location. Their ‘MST’ units were configured for planar motion only and would not be able to process the expansion type of optical flow patterns that drive many MST neurons (Britten & van Wezel, 1998; Duffy & Wurtz, 1991; Saito et al., 1986; Tanaka et al., 1986). 
The eye rotation problem is closely tied to the problem of how local motion signals are combined across the visual field and how self-motion information is recovered. The new model we present in this paper is motivated by the need for a compensation mechanism that works in the case of multidirectional 2-d image motion distributed across wide areas of the visual field (as in Figure 2c). Our ultimate aim is to integrate the new cancellation mechanism into our previously developed self-motion ‘analyzers’ (Perrone, 1992; Perrone & Stone, 1994, 1998). However, in this paper, we concentrate on the stages prior to the integration of motion signals across space. Nevertheless, we also show how integration can influence the way the cancellation mechanism is implemented. We address the limitations of the standard CD/EC theory and outline a new neural-based ‘cancellation’ mechanism that the primate visual system could use to remove the effect of eye movement induced motion. 
Vector addition and subtraction
We first describe a general mechanism for carrying out vector addition (or subtraction) using neural signals. This description outlines what could be happening at a ‘local’ level, where only one image location is being considered. We then go on to show how such a mechanism could be equally well applied at later stages of motion processing, after the local motion signals have been integrated across wider areas of the visual field (e.g., in area MSTd or perhaps VIP). 
The problem we are addressing is represented by the vectors in Figure 3a. The T vector (blue) is the retinal image motion that would be present at a particular image location if the observer is translating through the world while not making an eye movement. The R vector (red) represents the image motion produced by a pursuit eye movement to the right and slightly upward while no translation occurs. The net retinal motion that occurs when translation and pursuit occur at the same time is given by the vector sum (T + R) shown as a black arrow. The visual system experiences the T + R retinal image motion but must recover T in order to correctly estimate the body's self-motion parameters and to correctly recover the relative depth of points in the world. It has access to the value of R (through an extraretinal signal), but how is T obtained from T + R? Standard vector algebra tells us that one simply subtracts R from T + R to find T, but how does this vector subtraction occur in the brain? 
Figure 3
 
(a) The eye-rotation problem in vector form. The image motion generated by the eye movement ( R) is vector added to the motion generated by observer translation ( T) to give S. R must be subtracted from S to recover T. (b) Standard vector addition. Each arrow represents an image velocity vector with speed and direction shown in brackets. The vector sum is derived from the projected components of the two vectors onto the X and Y axes (see dashed lines).
For the analysis that follows, we will assume that B = −R, where R is the image motion generated by the eye rotation (Figure 3a). In order to subtract off the eye movement vector (R) from another vector (A), we need to add −R (= B) to it. The standard technique for adding two vectors A and B using vector algebra is shown in Figure 3b. If vector A has speed and direction given by ($V_A$, $\theta_A$) and B = ($V_B$, $\theta_B$), then their sum (C) is found by projecting A and B onto the X and Y axes and using the following equations to find the magnitude and angle of C:
$$V_C = \frac{V_A \cos\theta_A + V_B \cos\theta_B}{\cos\theta_C}, \tag{1}$$

$$\theta_C = \tan^{-1}\left(\frac{V_A \sin\theta_A + V_B \sin\theta_B}{V_A \cos\theta_A + V_B \cos\theta_B}\right). \tag{2}$$
Finding the projections using cosine and sine weighting would not be too difficult for a biological system, but the calculation of the arctan function seems more problematic. Cosine tuning is prevalent in many brain areas (Georgopoulos, Schwartz, & Kettner, 1986; Krauzlis & Lisberger, 1996; Wylie, Bischof, & Frost, 1998), but we are not aware of any tuning that maps onto the hyperbolic function required for the calculation of the inverse tangent needed in Equation 2. The problem of singularities associated with the inverse tangent also seems to preclude any simple biological implementation. It turns out that there is another method by which vector addition (or subtraction) can be implemented. We believe this alternative form would be more amenable to a biological system because it simply involves the addition and subtraction of neural activity. Our alternative system is illustrated in Figure 4.
Figure 4
 
An alternative form of carrying out vector addition. Each vector is projected onto a series of projection axes, which sample the full 360° range of directions. The projected component of the vector is plotted in Cartesian form on the right. (a) Vector components for A. Three example projections are shown on the left (see dashed lines). The vector component corresponds to the distance from the origin to the point where the dashed line meets the axis. (b) Vector component distribution for B vector. (c) If the distribution of vector components for A and B are summed, the new distribution has a peak amplitude and phase corresponding to the length and direction of the vector sum, A + B (see standard vector plot on the left).
Instead of projecting the A and B vectors onto just two orthogonal axes (X and Y) as was done in Figure 3b, each vector is projected onto a series of axes (left-hand side of Figure 4). The axes cover the full 360° range of directions and we will sample this range in 30° steps. Let the angle of each axis be given by $\phi_i$. The projection of A onto this set of axes produces a distribution given by
$$D(A) = V_A \cos(\phi_i - \theta_A), \tag{3}$$
where $\phi_i$ runs from 0° to 360° in 30° steps. For the B vector, the distribution is given by
$$D(B) = V_B \cos(\phi_i - \theta_B). \tag{4}$$
The two distributions are shown in the right-hand part of Figure 4. They are cosine curves with the amplitude corresponding to the length of the vectors and the phase corresponding to their direction. 
It is easy to show that if these two distributions are summed, the resulting distribution has an amplitude and phase corresponding to the length and direction of the vector sum of A and B. The proof is given below. 
The sum of the two vector distributions can be represented as  
$$D(A + B) = V_A \cos(\phi_i - \theta_A) + V_B \cos(\phi_i - \theta_B). \tag{5}$$
When the direction of the vector sum is equal to the direction of the projection axis ($\phi_i$), i.e., $\theta_C = \phi_i$, we have
$$D(A + B) = V_A \cos(\theta_C - \theta_A) + V_B \cos(\theta_C - \theta_B). \tag{6}$$
Expanding out the cosine terms and rearranging gives
$$D(A + B) = \cos\theta_C \,(V_A \cos\theta_A + V_B \cos\theta_B) + \sin\theta_C \,(V_A \sin\theta_A + V_B \sin\theta_B), \tag{7}$$
which reduces to
$$D(A + B) = \frac{V_A \cos\theta_A + V_B \cos\theta_B}{\cos\theta_C} = V_C \tag{8}$$
(see Equation 1).
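The step from Equation 7 to Equation 8 can be filled in as follows (a reconstruction based only on Equation 2, and assuming $\cos\theta_C \neq 0$). Equation 2 implies

$$V_A \sin\theta_A + V_B \sin\theta_B = \tan\theta_C \,(V_A \cos\theta_A + V_B \cos\theta_B),$$

and substituting this into Equation 7 gives

$$D(A + B) = (V_A \cos\theta_A + V_B \cos\theta_B)\left(\cos\theta_C + \frac{\sin^2\theta_C}{\cos\theta_C}\right) = \frac{V_A \cos\theta_A + V_B \cos\theta_B}{\cos\theta_C},$$

because $\cos^2\theta_C + \sin^2\theta_C = 1$.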
This verifies that the sum of the two distributions has an amplitude corresponding to the magnitude of the vector sum (A + B), and that this peak amplitude occurs at $\phi_i = \theta_C$ (the vector sum direction). 
By finding the cosine projections of A and B onto a set of axes $\phi_i$, and summing the A and B projection values, a new cosine distribution is formed with an amplitude and phase ($V_C$, $\theta_C$) corresponding to the speed and direction of the vector sum of A and B. This means that if we had a cosine distribution corresponding to the sum of the translation vector and an eye rotation vector (T + R), we could add a cosine distribution based on −R and we would end up with a distribution for T. This would solve the problem shown in Figure 3a. The above proof shows that the sum of the two distributions, D(T + R) + D(−R), will have an amplitude $V_T$ and direction $\theta_T$.
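The argument above can be checked numerically with a short Python sketch. It samples the projection axes in 30° steps as in the text, sums the distributions sample by sample, and then reads the amplitude and phase back out. The function names, the example vectors, and the `decode` readout (a first-harmonic fit used here only to verify the result) are assumptions made for illustration; they are not part of the original model and imply nothing about how the brain might read the distribution.

```python
import math

AXES_DEG = list(range(0, 360, 30))  # projection axes phi_i in 30 deg steps

def cosine_distribution(speed, direction_deg):
    """Project a vector (speed, direction) onto every axis phi_i (Equation 3)."""
    return [speed * math.cos(math.radians(phi - direction_deg)) for phi in AXES_DEG]

def add_distributions(d1, d2):
    """Sum two distributions axis by axis (Equation 5)."""
    return [a + b for a, b in zip(d1, d2)]

def decode(distribution):
    """Recover (amplitude, phase in degrees) of a sampled cosine.
    ASSUMPTION: first-harmonic readout, used only to verify the sum."""
    n = len(distribution)
    c = 2.0 / n * sum(d * math.cos(math.radians(p)) for d, p in zip(distribution, AXES_DEG))
    s = 2.0 / n * sum(d * math.sin(math.radians(p)) for d, p in zip(distribution, AXES_DEG))
    return math.hypot(c, s), math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical example vectors: T = (3 deg/s, 70 deg), R = (5 deg/s, 10 deg).
d_sum = add_distributions(cosine_distribution(3.0, 70.0),
                          cosine_distribution(5.0, 10.0))      # D(T + R)
d_recovered = add_distributions(d_sum,
                                cosine_distribution(5.0, 190.0))  # + D(-R)

print(decode(d_sum))        # amplitude and phase of T + R
print(decode(d_recovered))  # amplitude and phase after adding D(-R)
```

Note that −R is represented as a distribution with the same amplitude as R and a phase shifted by 180°, so the whole operation uses additions of activity only.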
Results
We have established that a form of vector addition can be carried out using cosine distributions. How can we use this technique to remove the effect of eye movements from combined T + R retinal image motion? The first requirement is that the image motion at a particular retinal location (x, y) is represented in the form of a cosine distribution of activity similar to that shown in Figure 5a.
Figure 5
 
Vector subtraction using visual motion and extraretinal signals. (a) Cosine distribution representing the output of a set of velocity sensors located at a particular image location. The overall image motion is in a 30° direction with a speed of 8°/s. The broad tuning of the motion sensors means that other directions are activated as well and we assume that the tuning is cosine. The flower-like inset represents the velocity outputs in polar plot form with the gray lines indicating inhibitory signals. (b) Cosine distribution of activity arising from an extraretinal (pursuit signal) source. The ‘efference copy/corollary discharge’ signal distribution is set to have an amplitude proportional to the speed of the image motion created by the eye movement (6.9°/s) and phase equal to the direction (180°). (c) Sum of the two distributions shown in (a) and (b). It has a peak and phase corresponding to the amplitude and direction of the vector sum of the two motions shown in (a) and (b).
The peak amplitude of the cosine distribution needs to be in proportion to the speed of the image motion at that location, i.e., the velocity of the image motion needs to be encoded by sets of neurons at this image location. This is not trivial; the problem of image velocity estimation by biological systems is a difficult one with a long history (Nakayama, 1985). We have made progress in this area (Perrone & Krauzlis, 2007) using models of motion processing based on neurons in the V1 and the Middle Temporal (MT/V5) areas of the primate brain (Perrone, 2004, 2005; Perrone & Thiele, 2002). At each image location we sample direction in 30° steps and for each direction we use a small set (6) of MT neurons tuned to a range of spatial frequencies and speeds to extract the speed of motion. This gives us an estimate of the velocity at each location. The cosine directional tuning of the MT input units (Albright, 1984) means that the velocity sensor array produces an output distribution that is cosine. The amplitude of the cosine corresponds to the speed of image motion at that location and the phase corresponds to the direction. 
The details of how the cosine tuning arises are not critical to our model and are outside the scope of the problem addressed by the new CD/EC theory described here. There is ample evidence of cosine tuning for both sensory and motor-related neural activity, including the coding of visual motion direction by neurons in MT (Albright, 1984), and the generation of velocity commands for pursuit eye movements by Purkinje cells in the cerebellum (Krauzlis & Lisberger, 1996). Here, we will simply assume that a directional array of velocity sensors exists at (x, y), tuned to a range of directions (0° to 360° in 30° steps). We will further assume that this array outputs a cosine distribution of activity with amplitude $V_{T+R}$ and phase $\theta_{T+R}$. Instead of the Y axis on the cosine distribution plots representing the cosine projection value as in Figure 4, the Y axes in Figure 5a depict the activity from a particular velocity sensor tuned to direction $\phi_i$. For actual neurons, the neural activity cannot be negative as illustrated in the cosine curves but the positive and negative values could be coded for using two ‘opponent’ sets of velocity sensors in a similar manner to the ‘on’ and ‘off’ systems proposed for neurons in the earlier stages of the visual system (Hubel & Wiesel, 1962; Kuffler, 1952). If at each location (x, y) there exists a velocity sensor $v_1$ tuned to direction $\phi$ and another ($v_2$) tuned to $\phi$ + 180° then one ‘channel’ could code for $v_1 - v_2$ and another for $v_2 - v_1$. Prior to the addition of the activity from the two channels, the two outputs could be half-wave rectified and the inverted polarity of the $v_2 - v_1$ channel output could be corrected by use of an inhibitory interneuron. This ‘on’ and ‘off’ system would enable the negative parts of the cosine distributions to be represented as neural activity. 
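The opponent-channel idea can be illustrated with a small Python sketch. It assumes, as a simplification that is not stated in the text, that each individual sensor fires as a half-wave rectified cosine of direction; the function names and the example motion (8°/s at 30°) are likewise chosen only for illustration. The sketch shows that the two rectified channels are never negative, yet together they carry the full signed cosine.

```python
import math

AXES_DEG = list(range(0, 360, 30))  # preferred directions phi in 30 deg steps

def rectify(x):
    """Half-wave rectification: activity cannot fall below zero."""
    return max(0.0, x)

def sensor_rate(speed, direction_deg, preferred_deg):
    """ASSUMPTION: a single sensor responds as a rectified cosine of direction."""
    return rectify(speed * math.cos(math.radians(preferred_deg - direction_deg)))

def opponent_channels(v1, v2):
    """Return the rectified 'on' (v1 - v2) and 'off' (v2 - v1) channel outputs."""
    return rectify(v1 - v2), rectify(v2 - v1)

def signed_activity(on, off):
    """Combine the channels; the minus sign stands in for the inhibitory interneuron."""
    return on - off

speed, direction = 8.0, 30.0  # hypothetical image motion
signed = []
for phi in AXES_DEG:
    v1 = sensor_rate(speed, direction, phi)        # sensor tuned to phi
    v2 = sensor_rate(speed, direction, phi + 180)  # sensor tuned to phi + 180 deg
    on, off = opponent_channels(v1, v2)
    signed.append(signed_activity(on, off))

print([round(s, 2) for s in signed])
```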
Given the above representation of image motion in the form of a cosine distribution of activity, the removal of the eye rotation vector becomes straightforward. We simply add a cosine distribution of activity to the distribution of activity generated by the different velocity estimators. The amplitude of the distribution needs to be in proportion to the rate of eye movement and the phase of the distribution needs to be equal to the direction of the eye movement and opposite the direction of the image motion (−R) (Figure 5b). The addition of these two distributions results in a new distribution corresponding to what would have been produced by the T vector alone (Figure 5c). We have effectively removed the effect of the eye rotation from the combined T + R motion. It is equivalent to performing the vector operation T + R + (−R), but we believe this is a more physiologically plausible operation than the trigonometric approach shown in Figure 3b. Our method simply requires that certain fixed levels of activity (both excitatory and inhibitory) are added into the outputs of an array of motion analyzers (velocity encoders) at each retinal location ($x_i$, $y_i$) in response to (or in anticipation of) a particular eye movement. We are suggesting that the ‘efference copy’ or ‘corollary discharge’ signal used in the CD/EC theories (Sperry, 1950; von Holst & Mittelstaedt, 1950) consists of a cosine distribution of activity and that this distribution is added to the activity being generated by a directional set of velocity sensors. 
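As a worked check, the values quoted in the Figure 5 caption can be combined directly: 8°/s at 30° for the velocity sensor distribution and 6.9°/s at 180° for the extraretinal distribution. The result below is computed from those two caption values and rounded; it is not quoted from the original figure.

$$\begin{aligned}
x &= 8\cos 30^\circ + 6.9\cos 180^\circ \approx 6.93 - 6.90 = 0.03,\\
y &= 8\sin 30^\circ + 6.9\sin 180^\circ = 4.00,\\
V &= \sqrt{x^2 + y^2} \approx 4.0^\circ/\mathrm{s}, \qquad \theta = \tan^{-1}(y/x) \approx 89.6^\circ .
\end{aligned}$$

Equivalently, $8\cos(\phi_i - 30^\circ) + 6.9\cos(\phi_i - 180^\circ) \approx 4.0\cos(\phi_i - 89.6^\circ)$, so with these values the summed distribution is that of an almost vertical motion at about half the original speed.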
An animated movie demonstrating the new CD/EC mechanism at work is shown in Figure 6. The original distribution of activity points upward and corresponds to the distribution of activity from a set of velocity encoders in response to motion of a vertically moving image feature. This represents the image motion produced by the combined translation and eye rotation (T + R). As the eyes rotate, a cosine distribution of activity proportional to the eye velocity (−R) is added to the T + R distribution and the new distribution acquires an orientation with its main axis tilted to the left. The new distribution's orientation and amplitude correspond to the image motion that would have occurred if the eye movement had not taken place (T). Notice that the activity in each direction is simply being increased or decreased (the lines are growing and shrinking), yet the overall distribution changes direction. These ‘local’ changes in activity enable the equivalent of vector subtraction to take place at this image location. The mechanism works for a range of image motion velocities and has more power than the basic ‘cancellation’ mechanism described in the original CD/EC theories. It was never clear how these systems dealt with velocity flow fields containing multiple speeds and directions. 
 
Figure 6
 
Animated movie demonstrating vector addition using cosine distributions. The original distribution has a phase angle corresponding to vertical (90°) and it shifts to the left (120°) as the efference copy distribution is added to it. Note that the directional shift is caused by changes to the length of the individual bars in the figure, not by rotation of the lines (use the slider bar in the movie player to slow down the animation).
Vector subtraction before or after integration?
The above description of the vector addition/subtraction operation implies that the cosine distribution of activity from the CD/EC source is added locally at each image location. Under this scheme, the vector operation would precede the integration of the motion signals across the visual field. The signal would be ‘corrected’ prior to being summed by the type of full-field self-motion analyzers we have proposed previously (Perrone, 1992; Perrone & Stone, 1994, 1998). There is some evidence that this may not be the strategy adopted by the human visual system. Beintema and van den Berg have argued that the compensation for eye movements must occur after the spatial integration stage and presented psychophysical results to support their position (Beintema & van den Berg, 2001; van den Berg & Beintema, 2000). By measuring the precision of heading judgements made by their observers under pure translation and translation plus pursuit conditions, they concluded that basic local ‘vector subtraction’ cannot be occurring and that the eye movement compensation must be occurring after the motion has been integrated across the visual field. Their data are problematic for self-motion estimation schemes that rely on local vector subtraction (Banks, Ehrlich, Backus, & Crowell, 1996; Royden, 2002; Royden et al., 1992; Royden, Crowell, & Banks, 1994) and seem to preclude the type of vector subtraction we are proposing in this paper. It turns out, however, that our scheme is not reliant on the vector operations being carried out prior to integration across the visual field. 
Figure 7a is a representation of an MST-like heading template that we have previously used to model heading estimation (Perrone, 1992; Perrone & Stone, 1994, 1998). The unit at the center integrates the motion information from velocity sensors at image locations distributed over a wide area of the visual field. It represents a cell tuned to a radial pattern of expanding motion (hence the ‘template’ label) or, equivalently, a cell tuned to a particular heading direction. The point of expansion marked by the circle is known as the ‘Focus of expansion’ (FOE; Gibson, 1950). Only one FOE position is shown in Figure 7a but heading is determined using many such units tuned to a range of heading directions (and different FOE positions). In the original heading models (Perrone, 1992; Perrone & Stone, 1994, 1998), any activity from MT-like motion sensors in the radial direction out from the FOE was summed. Since the MT sensors used in these earlier models are speed tuned and do not code velocity directly, the activity at a particular image location is not in proportion to the image velocity. Here we are assuming that a velocity signal proportional to T + R is available at (x, y) and are suggesting that the component of this velocity signal along the radial direction is summed by the heading detector unit. The component signal is readily available from the cosine distribution of activity generated by our velocity detector array (see Figure 5a) and it is present in the unit coding for the radial direction. 
Figure 7
 
Integration of motion information across the visual field by heading detectors. (a) Heading detector tuned to direction 0° azimuth and 0° elevation. The blue vectors indicate image motion generated during pure translation in the (0°, 0°) direction. The red vectors show the image motion caused by a pursuit eye movement to the right and the black vectors represent the vector sum (T + R). The heading detector sums the components of the motion vectors along the radial directions (gray lines). (b) Outputs from a set of heading templates tuned to a range of azimuth directions in response to the vectors shown in (a). The blue curve shows the response distribution when no eye rotation occurs (blue vectors). The template tuned to 0° azimuth responds the most and the correct heading is signaled. The black curve shows the heading detector output distribution when a pursuit signal occurs. Without some sort of compensation mechanism, the heading is incorrectly signaled as being 15° to the right.
Let three arbitrary image locations be represented by ( x 1, y 1), ( x 2, y 2), and ( x 3, y 3) and the radial directions out from the FOE location (see gray lines) as ϕ 1, ϕ 2, and ϕ 3. The blue vectors ( T) represent the image velocity generated by an observer moving in the direction of the FOE (assumed to be 0° azimuth and 0° elevation in the figure). The red vectors ( R) are the image velocity vectors produced by a pursuit eye movement to the right and T + R represents the combined retinal image motion that occurs when the forward translation and pursuit occur at the same time. Let the angles of the T + R vectors at each position be θ 1, θ 2, and θ 3 and their lengths be v 1, v 2, and v 3. If no eye movement compensation is in place, the heading detector in the figure sums the component of the T + R vectors in the radial directions. The component for position 1 is indicated by the distance from ( x 1, y 1) to the point at which the dashed black line meets the T vector direction. 
A heading detector tuned to a different direction ( α i, β i) will have different values for ϕ 1, ϕ 2, and ϕ 3 and so the radial components will be different as well. The blue curve in Figure 7b shows the total activity summed across the three blue vectors for a number of such heading detectors, tuned to a range of different azimuth directions (−40° to +40°). This is for the case when no pursuit occurs and the peak of the curve occurs in the heading detector tuned to 0° azimuth. The correct heading direction is indicated by the heading detector in the array with the largest output. The black curve is for the case in which a pursuit eye movement occurs during the translation of the observer and represents the total activity from the black vectors. Notice that the peak in the array of detectors now incorrectly signals that the heading is 15° to the right. Summation of the three T + R components results in an incorrect heading estimate because the components of T + R along the ϕ 1, ϕ 2, and ϕ 3 directions are not the same as those for T. The pursuit rotation has added an additional component to each of the T vectors. 
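The summation performed by the array of heading detectors can be sketched as follows. This is a toy illustration under stated assumptions, not a reproduction of Figure 7b: the three image locations, the flow speeds, the uniform leftward image motion standing in for a rightward pursuit, and the function names are all chosen here for illustration.

```python
import numpy as np

def to_polar(vectors):
    """Speed and direction (radians) of each row of an (N, 2) array."""
    return np.hypot(vectors[:, 0], vectors[:, 1]), np.arctan2(vectors[:, 1], vectors[:, 0])

def detector_response(points, flow, foe):
    """Sum of v_i * cos(phi_i - theta_i): flow components along radial lines from foe."""
    offsets = points - foe
    phi = np.arctan2(offsets[:, 1], offsets[:, 0])   # radial directions phi_i
    speed, theta = to_polar(flow)                    # v_i and theta_i
    return np.sum(speed * np.cos(phi - theta))

# Assumed image locations (deg) and toy flow fields; none of these values are from the paper.
points = np.array([[-10.0, 10.0], [10.0, 10.0], [0.0, -10.0]])
T = 0.1 * points                                     # expansion away from an FOE at (0, 0)
R = np.tile([-1.0, 0.0], (len(points), 1))           # uniform leftward image motion

azimuths = np.arange(-40.0, 45.0, 5.0)               # detector tuning, elevation fixed at 0
foes = np.column_stack([azimuths, np.zeros_like(azimuths)])

h_T = np.array([detector_response(points, T, foe) for foe in foes])
h_TR = np.array([detector_response(points, T + R, foe) for foe in foes])

print("peak, translation only:", azimuths[np.argmax(h_T)])
print("peak, translation plus pursuit:", azimuths[np.argmax(h_TR)])
```

In this toy configuration the translation-only peak sits at 0° azimuth, while the added rotation component moves the peak to a detector tuned to the right. The size of the shift depends on the assumed values and is not meant to match the 15° of the figure.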
One method of correcting for the pursuit rotation is shown in Figure 8a. For the FOE position shown in Figure 8a, we can represent the total amount of activity in a particular heading detector when no correction (NC) is made for the presence of R as  
$$H_{NC} = v_1 \cos(\phi_1 - \theta_1) + v_2 \cos(\phi_2 - \theta_2) + v_3 \cos(\phi_3 - \theta_3). \tag{9}$$
It is possible to correct for the rotation at each image location by adding the projection of the − R vector onto the radial direction axes (see dashed lines in Figure 8a). This is equivalent to what was carried out in the vector addition demonstration above ( Figure 4) but only three projection axes are being considered here ( ϕ 1, ϕ 2, and ϕ 3). If vector R has speed and direction ( v R, θ R), the corrected (C) total heading activity for the detector shown in Figure 8a is now given by  
$$H_{C} = v_1 \cos(\phi_1 - \theta_1) - v_R \cos(\phi_1 - \theta_R) + v_2 \cos(\phi_2 - \theta_2) - v_R \cos(\phi_2 - \theta_R) + v_3 \cos(\phi_3 - \theta_3) - v_R \cos(\phi_3 - \theta_R). \tag{10}$$
Each subtraction removes the R component along the radial direction and ensures that the correct velocity component (equal to T) is now summed by the heading detector. This particular heading detector now responds maximally and signals the correct heading (see Figure 8b). 
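The local correction can be verified term by term for a single heading detector. The sketch below assumes three image locations, a translation flow that is purely radial about the detector's FOE at (0°, 0°), and a rotation vector R of speed 1 pointing leftward. These values are illustrative and not taken from the paper.

```python
import numpy as np

points = np.array([[-12.0, 6.0], [8.0, 10.0], [6.0, -8.0]])  # assumed locations (deg)
foe = np.array([0.0, 0.0])                                   # detector tuned to (0, 0)
T = 0.1 * points                                             # assumed translation flow
v_R, theta_R = 1.0, np.pi                                    # assumed R: speed 1, leftward

R = v_R * np.array([np.cos(theta_R), np.sin(theta_R)])
flow = T + R                                                 # what the sensors measure

offsets = points - foe
phi = np.arctan2(offsets[:, 1], offsets[:, 0])               # radial directions phi_i
v = np.hypot(flow[:, 0], flow[:, 1])                         # speeds v_i of T + R
theta = np.arctan2(flow[:, 1], flow[:, 0])                   # directions theta_i of T + R

uncorrected = v * np.cos(phi - theta)                        # radial components of T + R
rotation = v_R * np.cos(phi - theta_R)                       # R component on each radial line
corrected = uncorrected - rotation                           # local subtraction at each location

H_NC, H_C = uncorrected.sum(), corrected.sum()
H_T = np.sum(np.hypot(T[:, 0], T[:, 1]))                     # T is purely radial here
print(f"H_NC = {H_NC:.4f}, H_C = {H_C:.4f}, pure translation = {H_T:.4f}")
```

Each corrected term equals the length of the corresponding T vector, so the corrected total matches the response to translation alone.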
Figure 8
 
Compensation for eye movements during forward translation using an efference copy/corollary discharge signal. (a) Local vector subtraction using cosine components. The projection of the vectors (see dashed lines) onto the radial lines is equal to the cosine component of the vectors along the radial directions. The component of the rotation vector ( R) is subtracted from the component of the T + R vector to give the original T vector. (b) Because the vector subtraction operation restores the original T vectors at each location, the heading detector output distribution is the same as that shown in Figure 7b and now peaks at the correct heading (0° azimuth). (c) Compensation after integration. A combined signal, E (equal to the sum of the − R components shown in (a)) is added to the summed activity of each heading detector. (d) Distribution of activity across heading detectors. Black line is the combined T + R distribution, which signals the incorrect heading direction. The red line is the E signal applied across the different heading detectors. The vertical dashed line shows the activity levels required for the detector shown in (c). When the black and red distributions are added, the resulting distribution (blue curve) is equal to the pure translation distribution ( T) and signals the correct heading.
Notice that Equation 10 can be rearranged as  
$$H_{C} = \left[ v_1 \cos(\phi_1 - \theta_1) + v_2 \cos(\phi_2 - \theta_2) + v_3 \cos(\phi_3 - \theta_3) \right] - v_R \left[ \cos(\phi_1 - \theta_R) + \cos(\phi_2 - \theta_R) + \cos(\phi_3 - \theta_R) \right]. \tag{11}$$
The first set of square-bracketed terms is the uncorrected summed activity and is equivalent to that shown in Equation 9. The second set (the R components) at the end of Equation 11 is fully determined once the pursuit velocity is known. This is because, for a given heading direction (i.e., a particular heading detector location) and a particular image location ( x i, y i), the value of ϕ i can be found by calculating the direction of the image location relative to the detector location. Since v R and θ R are known, it is straightforward to calculate the size of the projected − R component (cosine component) for a particular pursuit velocity. Therefore a fixed amount of total E (efference/corollary signal) activity can be added to the uncorrected total activity within the heading detector, and Equation 11 shows that this produces the same total output as the local correction of Equation 10. This operation is depicted in Figure 8c as the addition of a neural signal E( α, β, V R) to the output of the heading detector tuned to heading direction ( α, β). The size of E is a function of the heading tuning of the detector ( α, β) and the pursuit velocity ( V R) only. It does not depend on the size of T.
The red curve in Figure 8d shows the different amounts of E activity applied to the whole range of heading detectors. For the heading detector tuned to (0°, 0°), only a small amount of positive efference/corollary signal needs to be added to the total activity from the three T + R vectors (see vertical dashed line). For other heading detectors, the amount of inhibitory or excitatory E activity is higher. When this distribution of activity is added to the ‘uncorrected’ T + R distribution (black curve), the result is a distribution corresponding to that produced by the T vectors alone (blue curve) and the correct heading is once again signaled by the heading detector with the largest output (0°, 0°). In this case, however, the correction has been applied after the integration stage and no ‘local’ vector subtraction has occurred (cf. Figures 8a and 8b). Notice that the distributions depicted in Figure 8d have a negative component. This requires that the heading detectors have an ‘opponent’ stage included in their design in order to signal this negative neural activity (see discussion of “on” and “off” systems above). 
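The post-integration scheme can be sketched in the same toy setting. The code below is an illustration under assumptions, not the authors' simulation: the image locations, flow speeds, leftward rotation vector, and function names are hypothetical. It computes one E value per heading detector from the detector tuning and the pursuit velocity alone, then adds the resulting curve to the uncorrected curve.

```python
import numpy as np

points = np.array([[-12.0, 6.0], [8.0, 10.0], [6.0, -8.0]])   # assumed locations (deg)
T = 0.1 * points                                              # assumed expansion about (0, 0)
v_R, theta_R = 1.0, np.pi                                     # assumed pursuit-induced motion
R = v_R * np.array([np.cos(theta_R), np.sin(theta_R)])

azimuths = np.arange(-40.0, 45.0, 5.0)                        # detector tuning, elevation 0

def radial_directions(azimuth):
    """Directions phi_i from a detector's FOE at (azimuth, 0) to each image location."""
    offsets = points - np.array([azimuth, 0.0])
    return np.arctan2(offsets[:, 1], offsets[:, 0])

def summed_activity(flow, azimuth):
    """Integrated radial components of a flow field (the form of Equation 9)."""
    phi = radial_directions(azimuth)
    speed = np.hypot(flow[:, 0], flow[:, 1])
    theta = np.arctan2(flow[:, 1], flow[:, 0])
    return np.sum(speed * np.cos(phi - theta))

def efference_signal(azimuth):
    """E for one detector: depends on its tuning and on (v_R, theta_R), not on T."""
    return -v_R * np.sum(np.cos(radial_directions(azimuth) - theta_R))

black = np.array([summed_activity(T + R, a) for a in azimuths])  # uncorrected T + R
red = np.array([efference_signal(a) for a in azimuths])          # E across detectors
blue = black + red                                               # corrected after integration

target = np.array([summed_activity(T, a) for a in azimuths])     # pure translation
print("max |blue - target| =", np.max(np.abs(blue - target)))
print("corrected peak at azimuth", azimuths[np.argmax(blue)])
```

Because `efference_signal` never touches `T`, the same E curve would serve for any translation flow sampled at these locations. The curve takes both signs across the detector array, which is why an opponent stage is needed to carry it.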
Because our new vector subtraction system could be implemented either before or after the motion inputs are spatially integrated, our model is agnostic about the particular cortical locus of the subtraction stage. In principle, the vector subtraction could occur locally prior to integration across space ( Figures 8a and 8b)—for example, at the level of our velocity estimators, which use MT inputs. More likely, as shown by Equation 11, the rotation removal stage could be separated out and applied at a point after which the local (uncorrected) velocities have been integrated ( Figures 8c and 8d)—for example, at the level of MST or other brain areas that receive inputs from MST (e.g., area VIP). The vector subtraction system we have proposed is therefore compatible with the psychophysical results indicating that eye velocity and visual signals interact at a later stage of processing possibly occurring in area MST (Beintema & van den Berg, 2001; van den Berg & Beintema, 2000). Independent of where the correction takes place, the activity across a population of the integration neurons (see Figure 8) would have the confounding eye rotation signal removed, and the maximum activity across the set of neurons would correctly signal the global motion identifying the direction of heading. 
Discussion
The execution of eye movements in conjunction with locomotion through the environment creates complex patterns of retinal image motion (Gibson, 1950; Koenderink & van Doorn, 1975; Longuet-Higgins & Prazdny, 1980; Nakayama & Loomis, 1974). In order to recover relevant information from these motion patterns (such as one's heading direction) the image motion generated by the eye movements must be removed. The image motion induced by eye movement must be subtracted following the rules of vector algebra. Human psychophysical tests of heading estimation show that the primate brain is capable of carrying out this vector subtraction (Freeman et al., 2000; Li & Warren, 2000; Royden et al., 1992; Warren & Hannon, 1988) but the underlying neural mechanisms remain unknown. We have provided a ‘proof of principle’ demonstrating that a form of vector subtraction can be implemented using cosine distributions of activity from velocity sensors and from oculomotor sources. 
We suggest that the vector subtraction mechanism takes place after the MT stage of motion processing. The reason for this is that MT neurons are speed tuned only (Lagae, Raiguel, & Orban, 1993; Maunsell & Van Essen, 1983; Perrone & Thiele, 2001; Priebe, Cassanello, & Lisberger, 2003) and an individual MT neuron cannot provide the velocity signal proportional to the length of T + R that is required by the new mechanism. Such a velocity signal would be available at a stage after the signals from a number of MT neurons are combined, such as area MST. This is also consistent with the predominance of pursuit signals in MST compared to MT (Erikson & Thier, 1991; Komatsu & Wurtz, 1988; Newsome et al., 1988). 
Our idea of modifying the signal distributions in maps of neurons downstream of area MT (e.g., MST, VIP) follows earlier similar suggestions (Beintema & van den Berg, 1998; Bradley et al., 1996; Perrone & Stone, 1994) but we have specifically tied our compensation mechanism to the concept of vector subtraction. At the level of MST, our model shares many similarities with the ‘velocity gain field’ model of Beintema and van den Berg (1998). However their eye movement ‘correction’ distributions are based on a Taylor series approximation and derivatives rather than cosine distributions. Also, the interaction between their gain fields and the uncorrected heading map is multiplicative whereas we use addition and subtraction. We believe that our system is more precise because it is based on the vector algebra that determines the retinal image velocities present in the visual field rather than derivatives of the heading map distributions. However a proper comparison of the performance of the two models cannot be made until suitable ‘front-end’ velocity encoders are developed. Both models rely on the assumption that a velocity signal is available at each image location and so their performance with realistic inputs cannot be assessed until this stage is implemented. 
We have assumed in the development of our model above that the amplitude of the CD/EC cosine distribution [ D( R)] is exactly equal to the speed of image motion generated by the eye movement (i.e., the gain = 1). This would produce perfect subtraction of the R vector. However, it is well established that, under a wide range of conditions, the cancellation of eye-rotation-induced motion is imperfect (e.g., Freeman & Banks, 1998; Freeman et al., 2000; Mack & Herman, 1973; Turano & Heidenreich, 1996; Wertheim, 1987) and that the gain of the extraretinal signal can be variable (Haarmeier et al., 2001). Effects such as the Filehne illusion (Filehne, 1922; Mack & Herman, 1973) and the Aubert–Fleischl phenomenon (Dichgans, Wist, Diener, & Brandt, 1975) would arise in our model from a less than optimum value being used for the amplitude parameter of the D( R) distribution. This raises the question of how the visual system would ‘calibrate’ a mechanism such as the one we are proposing. The motor signal used to drive the eye needs to be correctly combined with the visual motion signals for the system to work. The CD/EC cosine distribution must be applied with the correct amplitude, but it is not obvious how to initially scale the motor signals so that they match the visual signals. The two signal generators do not share a common coordinate system. For the post-integration scheme depicted in Figures 8c and 8d, the amplitude of the efference/corollary distribution (red curve) is dependent upon the number of vectors in the flow field. This is unknown to the pursuit system and so the amplitude needs to be modulated somehow by the strength of the visual motion signals present. One possibility that has been suggested is to use feedback from visually generated image motion (Haarmeier et al., 2001). However, deciding how the visual signals should be combined with the extraretinal signals is not a trivial matter. 
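The effect of a gain other than 1 can be illustrated with the same cosine-distribution arithmetic. This is a hedged sketch: the 12-unit array, the 10 deg/s pursuit speed, the gain values, and the function names are assumptions made for illustration. It treats a stationary background viewed during rightward pursuit, so that the retinal motion is leftward.

```python
import numpy as np

PREFERRED = np.deg2rad(np.arange(0, 360, 30))   # assumed: 12 units, 30 deg apart

def cosine_distribution(speed, direction_deg):
    """Activity across units for one motion vector."""
    return speed * np.cos(PREFERRED - np.deg2rad(direction_deg))

def decode(activity):
    """Recover (amplitude, phase in deg) of a cosine distribution of activity."""
    n = len(PREFERRED)
    x = 2.0 / n * np.sum(activity * np.cos(PREFERRED))
    y = 2.0 / n * np.sum(activity * np.sin(PREFERRED))
    return np.hypot(x, y), np.rad2deg(np.arctan2(y, x)) % 360

def perceived_motion(retinal_speed, retinal_dir_deg, gain):
    """Subtract a gain-scaled D(R) from the retinal distribution and decode the result."""
    retinal = cosine_distribution(retinal_speed, retinal_dir_deg)
    d_r = cosine_distribution(gain * retinal_speed, retinal_dir_deg)
    return decode(retinal - d_r)

# Stationary background, rightward pursuit at 10 deg/s: retinal motion is leftward (180 deg).
for gain in (1.0, 0.8, 0.5):
    speed, _ = perceived_motion(10.0, 180.0, gain)
    print(f"gain {gain:.1f}: residual speed {speed:.2f} deg/s")
```

With a gain below 1 the background keeps a residual motion in the retinal direction, that is, opposite to the pursuit, which is the direction reported for the Filehne illusion.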
An insight into the complexity of the problem can be found in data demonstrating non-linear interactions between extraretinal signals and retinal flow (Crowell & Andersen, 2001; van den Berg et al., 2001). One option we are exploring is to detect and measure the visual rotation components using full-field ‘rotation detectors’ (e.g., Perrone, 1992) and to use this visual signal to control the amplitude of the D(R) cosine in our model. We are currently testing the impact that such a feedback loop would have on our vector subtraction system. 
The mechanism we have outlined is generic in the sense that once the signals are in the form of cosine distributions of activity it is possible to remove the effects of other extraretinal sources, not just those created by eye movements. For example, image motion created by head movements could also be removed in this fashion. Head turns have been shown to influence human estimates of self-motion (Crowell, Banks, Shenoy, & Andersen, 1998) so neck proprioception signals could also be included in the rotation cancellation mechanism. Vestibular signals are present in MST (Bremmer, Kubischik, Pekel, Lappe, & Hoffmann, 1999; Duffy, 1998; Fetsch, Wang, Gu, DeAngelis, & Angelaki, 2007; Page & Duffy, 2003; Thier & Erikson, 1992). The neck proprioception and vestibular signals could be added to the oculomotor and visual signals, once they are transformed into a ‘common currency’ based on a cosine distribution of activity. 
Cosine distributions of activity have been observed and used in several other domains. For example, Wylie et al. (1998) found translational optic flow neurons in the pigeon brain that display cosine tuning curves. Pouget and Sejnowski (1997) discussed the use of cosine weighting for carrying out spatial transformations in parietal cortex. Georgopoulos et al. (1986) used cosine weighting to describe a population vector code model of movement direction in primate motor cortex. Krauzlis and Lisberger (1996) found that Purkinje cells in the ventral paraflocculus prefer only two primary directions of pursuit eye velocity (ipsiversive and downward), but these neurons nonetheless code for all directions of pursuit because of their cosine tuning. We have now demonstrated how cosine tuning distributions can be used in the context of primate visual motion processing and how they can be used for carrying out vector addition/subtraction operations on the motion signals associated with body and eye movements. We have also shown where these vector operations could take place along the V1–MT–MST motion pathway hierarchy. The ‘comparator’ unit in the traditional efference copy/corollary discharge theory diagram (see Figure 1) has for a long time been depicted as an empty circle. Our new theory has added some potential detail to this blank entity. 
Acknowledgments
Supported by the Marsden Fund Council from Government funding, administered by the Royal Society of New Zealand. 
Commercial relationships: none. 
Corresponding author: Assoc. Prof. John Perrone. 
Email: jpnz@waikato.ac.nz. 
Address: Psychology Department, The University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand. 
References
Albright, T. D. (1984). Direction and orientation selectivity of neurons in visual area MT of the macaque. Journal of Neurophysiology, 52, 1106–1130. [PubMed] [PubMed]
Banks, M. S. Ehrlich, S. M. Backus, B. T. Crowell, J. A. (1996). Estimating heading during real and simulated eye movements. Vision Research, 36, 431–443. [PubMed] [CrossRef] [PubMed]
Beintema, J. A. van den Berg, A. V. (1998). Heading detection using motion templates and eye velocity gain fields. Vision Research, 38, 2155–2179. [PubMed] [CrossRef] [PubMed]
Beintema, J. A. van den Berg, A. V. (2001). Pursuit affects precision of perceived heading for small viewing apertures. Vision Research, 41, 2375–2391. [PubMed] [CrossRef] [PubMed]
Bradley, D. C. Maxwell, M. Andersen, R. A. Banks, M. S. Shenoy, K. V. (1996). Mechanisms of heading perception in primate visual cortex. Science, 273, 1544–1547. [PubMed] [CrossRef] [PubMed]
Bremmer, F. Kubischik, M. Pekel, M. Lappe, M. Hoffmann, K. P. (1999). Linear vestibular self-motion signals in monkey medial superior temporal area. Annals of the New York Academy of Sciences, 871, 272–281. [PubMed] [CrossRef] [PubMed]
Britten, K. H. van Wezel, R. J. (1998). Electrical microstimulation of cortical area MST biases heading perception in monkeys. Nature Neuroscience, 1, 59–63. [PubMed] [CrossRef] [PubMed]
Crowell, J. A. Andersen, R. A. (2001). Pursuit compensation during self-motion. Perception, 30, 1465–1488. [PubMed] [CrossRef] [PubMed]
Crowell, J. A. Banks, M. S. Shenoy, K. V. Andersen, R. A. (1998). Visual self-motion perception during head turns. Nature Neuroscience, 1, 732–737. [PubMed] [CrossRef] [PubMed]
Cutting, J. E. Springer, K. Braren, P. A. Johnson, S. H. (1992). Wayfinding on foot from information in retinal, not optical, flow. Journal of Experimental Psychology: General, 121, 41–72. [PubMed] [CrossRef] [PubMed]
Dichgans, J. Wist, E. Diener, H. C. Brandt, T. (1975). The Aubert–Fleischl phenomenon: A temporal frequency effect on perceived velocity in afferent motion perception. Experimental Brain Research, 23, 529–533. [PubMed] [CrossRef] [PubMed]
Duffy, C. J. (1998). MST neurons respond to optic flow and translational movement. Journal of Neurophysiology, 80, 1816–1827. [PubMed] [Article] [PubMed]
Duffy, C. J. Wurtz, R. H. (1991). Sensitivity of MST neurons to optic flow stimuli. I. A continuum of response selectivity to large-field stimuli. Journal of Neurophysiology, 65, 1329–1345. [PubMed] [PubMed]
Eckert, M. P. Buchsbaum, G. (1993). Efficient coding of natural time varying images in the early visual system. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 339, 385–395. [PubMed] [Article] [CrossRef]
Erikson, R. G. Thier, P. (1991). A neuronal correlate of spatial stability during periods of self-induced visual motion. Experimental Brain Research, 86, 608–616. [PubMed] [PubMed]
Fetsch, C. R. Wang, S. Gu, Y. DeAngelis, G. C. Angelaki, D. E. (2007). Spatial reference frames of visual, vestibular, and multimodal heading signals in the dorsal subdivision of the medial superior temporal area. Journal of Neuroscience, 27, 700–712. [PubMed] [Article] [CrossRef] [PubMed]
Filehne, W. (1922). Über das optische Wahrnehmen von Bewegungen. Zeitschrift für Sinnesphysiologie, 53, 134–145.
Freeman, T. C. Banks, M. S. (1998). Perceived head-centric speed is affected by both extra-retinal and retinal errors. Vision Research, 38, 941–945. [PubMed] [CrossRef] [PubMed]
Freeman, T. C. Banks, M. S. Crowell, J. A. (2000). Extraretinal and retinal amplitude and phase errors during Filehne illusion and path perception. Perception & Psychophysics, 62, 900–909. [PubMed] [CrossRef] [PubMed]
Furman, M. Gur, M. (2003). Self-organizing neural network model of motion processing in the visual cortex during smooth pursuit. Vision Research, 43, 2155–2171. [PubMed] [CrossRef] [PubMed]
Georgopoulos, A. P. Schwartz, A. B. Kettner, R. E. (1986). Neuronal population coding of movement direction. Science, 233, 1416–1419. [PubMed] [CrossRef] [PubMed]
Gibson, J. J. (1950). The perception of the visual world. Boston: Houghton Mifflin.
Goldstein, E. B. (2007). Sensation and perception. Belmont, CA: Thomson Wadsworth.
Haarmeier, T. Bunjes, F. Lindner, A. Berret, E. Thier, P. (2001). Optimizing visual motion perception during eye movements. Neuron, 32, 527–535. [PubMed] [Article] [CrossRef] [PubMed]
Haarmeier, T. Thier, P. Repnow, M. Petersen, D. (1997). False perception of motion in a patient who cannot compensate for eye movements. Nature, 389, 849–852. [PubMed] [CrossRef] [PubMed]
Helmholtz, H. V. (1925). In Southall (Ed.), Helmholtz's treatise on physiological optics. III. New York: Optical Society of America.
Hubel, D. H. Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 160, 106–154. [PubMed] [Article] [CrossRef] [PubMed]
Inaba, N. Shinomoto, S. Yamane, S. Takemura, A. Kawano, K. (2007). MST neurons code for visual motion in space independent of pursuit eye movements. Journal of Neurophysiology, 97, 3473–3483. [PubMed] [Article] [CrossRef] [PubMed]
Koenderink, J. J. van Doorn, A. J. (1975). Invariant properties of the motion parallax field due to the movement of rigid bodies relative to an observer. Optica Acta, 22, 773–791. [CrossRef]
Komatsu, H. Wurtz, R. H. (1988). Relation of cortical areas MT and MST to pursuit eye movements. I. Localization and visual properties of neurons. Journal of Neurophysiology, 60, 580–603. [PubMed] [PubMed]
Krauzlis, R. J. Lisberger, S. G. (1996). Directional organization of eye movement and visual signals in the floccular lobe of the monkey cerebellum. Experimental Brain Research, 109, 289–302. [PubMed] [CrossRef] [PubMed]
Kuffler, S. W. (1952). Neurons in the retina; organization, inhibition and excitation problems. Cold Spring Harbor Symposia on Quantitative Biology, 17, 281–292. [PubMed] [CrossRef] [PubMed]
Lagae, L. Raiguel, S. Orban, G. A. (1993). Speed and direction selectivity of macaque middle temporal neurons. Journal of Neurophysiology, 69, 19–39. [PubMed] [PubMed]
Lappe, M. (1998). A model of the combination of optic flow and extraretinal eye movement signals in primate extrastriate visual cortex: Neural model of self-motion from optic flow and extraretinal cues. Neural Networks, 11, 397–414. [PubMed] [CrossRef] [PubMed]
Lee, B. Pesaran, B. Andersen, R. A. (2007). Translation speed compensation in the dorsal aspect of the medial superior temporal area. Journal of Neuroscience, 27, 2582–2591. [PubMed] [Article] [CrossRef] [PubMed]
Li, L. Sweet, B. T. Stone, L. S. (2006). Journal of Vision, 6, (9):2, 874–881, http://journalofvision.org/6/9/2/, doi:10.1167/6.9.2. [PubMed] [Article] [CrossRef]
Li, L. Warren, Jr., W. H. (2000). Perception of heading during rotation: Sufficiency of dense motion parallax and reference objects. Vision Research, 40, 3873–3894. [PubMed] [CrossRef] [PubMed]
Longuet-Higgins, H. C. Prazdny, K. (1980). The interpretation of a moving retinal image. Proceedings of the Royal Society of London B: Biological Sciences, 208, 385–387. [PubMed] [CrossRef]
Mack, A. Herman, E. (1973). Position constancy during pursuit eye movement: An investigation of the Filehne illusion. Quarterly Journal of Experimental Psychology, 25, 71–84. [PubMed] [CrossRef] [PubMed]
Maunsell, J. H. Van Essen, D. C. (1983). Functional properties of neurons in middle temporal visual area of the macaque monkey. I. Selectivity for stimulus direction, speed, and orientation. Journal of Neurophysiology, 49, 1127–1147. [PubMed] [PubMed]
Nakayama, K. (1985). Biological image motion processing: A review. Vision Research, 25, 625–660. [PubMed] [CrossRef] [PubMed]
Nakayama, K. Loomis, J. M. (1974). Optical velocity patterns, velocity-sensitive neurons, and space perception: A hypothesis. Perception, 3, 63–80. [PubMed] [CrossRef] [PubMed]
Newsome, W. T. Wurtz, R. H. Komatsu, H. (1988). Relation of cortical areas MT and MST to pursuit eye movements. II. Differentiation of retinal from extraretinal inputs. Journal of Neurophysiology, 60, 604–620. [PubMed] [PubMed]
Pack, C. Grossberg, S. Mingolla, E. (2001). A neural model of smooth pursuit control and motion perception by cortical area MST. Journal of Cognitive Neuroscience, 13, 102–120. [PubMed] [CrossRef] [PubMed]
Page, W. K. Duffy, C. J. (1999). MST neuronal responses to heading direction during pursuit eye movements. Journal of Neurophysiology, 81, 596–610. [PubMed] [Article] [PubMed]
Page, W. K., & Duffy, C. J. (2003). Heading representation in MST: Sensory interactions and population encoding. Journal of Neurophysiology, 89, 1994–2013.
Perrone, J. A. (1992). Model for the computation of self-motion in biological systems. Journal of the Optical Society of America A, Optics and Image Science, 9, 177–194.
Perrone, J. A. (2004). A visual motion sensor based on the properties of V1 and MT neurons. Vision Research, 44, 1733–1755.
Perrone, J. A. (2005). Economy of scale: A motion sensor with variable speed tuning. Journal of Vision, 5(1):3, 28–33, http://journalofvision.org/5/1/3/, doi:10.1167/5.1.3.
Perrone, J. A., & Krauzlis, R. J. (2007). Image velocity estimation based on vector averaging of MT neuron responses: The problem of spatial scale [Abstract]. Journal of Vision, 7(9):776.
Perrone, J. A., & Stone, L. S. (1994). A model of self-motion estimation within primate extrastriate visual cortex. Vision Research, 34, 2917–2938.
Perrone, J. A., & Stone, L. S. (1998). Emulating the visual receptive-field properties of MST neurons with a template model of heading estimation. Journal of Neuroscience, 18, 5958–5975.
Perrone, J. A., & Thiele, A. (2001). Speed skills: Measuring the visual speed analyzing properties of primate MT neurons. Nature Neuroscience, 4, 526–532.
Perrone, J. A., & Thiele, A. (2002). A model of speed tuning in MT neurons. Vision Research, 42, 1035–1051.
Pouget, A., & Sejnowski, T. J. (1997). Spatial transformations in the parietal cortex using basis functions. Journal of Cognitive Neuroscience, 9, 222–237.
Priebe, N. J., Cassanello, C. R., & Lisberger, S. G. (2003). The neural representation of speed in macaque area MT/V5. Journal of Neuroscience, 23, 5650–5661.
Regan, D., & Beverley, K. I. (1982). How do we avoid confounding the direction we are looking and the direction we are moving? Science, 215, 194–196.
Royden, C. S. (2002). Computing heading in the presence of moving objects: A model that uses motion-opponent operators. Vision Research, 42, 3043–3058.
Royden, C. S., Banks, M. S., & Crowell, J. A. (1992). The perception of heading during eye movements. Nature, 360, 583–585.
Royden, C. S., Crowell, J. A., & Banks, M. S. (1994). Estimating heading during eye-movements. Vision Research, 34, 3197–3214.
Saito, H., Yukie, M., Tanaka, K., Hikosaka, K., Fukada, Y., & Iwai, E. (1986). Integration of direction signals of image motion in the superior temporal sulcus of the macaque monkey. Journal of Neuroscience, 6, 145–157.
Shenoy, K. V., Bradley, D. C., & Andersen, R. A. (1999). Influence of gaze rotation on the visual response of primate MSTd neurons. Journal of Neurophysiology, 81, 2764–2786.
Sperry, R. W. (1950). Neural basis of the spontaneous optokinetic response produced by visual inversion. Journal of Comparative and Physiological Psychology, 43, 482–489.
Stone, L. S., & Perrone, J. A. (1997). Human heading estimation during visually simulated curvilinear motion. Vision Research, 37, 573–590.
Tanaka, K., Hikosaka, K., Saito, H., Yukie, M., Fukada, Y., & Iwai, E. (1986). Analysis of local and wide-field movements in the superior temporal visual areas of the macaque monkey. Journal of Neuroscience, 6, 134–144.
Thier, P., & Erickson, R. G. (1992). Responses of visual-tracking neurons from cortical area MST-l to visual, eye and head motion. European Journal of Neuroscience, 4, 539–553.
Turano, K. A., & Heidenreich, S. M. (1996). Speed discrimination of distal stimuli during smooth pursuit eye motion. Vision Research, 36, 3507–3517.
Upadhyay, U. D., Page, W. K., & Duffy, C. J. (2000). MST responses to pursuit across optic flow with motion parallax. Journal of Neurophysiology, 84, 818–826.
van den Berg, A. V. (1993). Perception of heading. Nature, 365, 497–498.
van den Berg, A. V., & Beintema, J. A. (2000). The mechanism of interaction between visual flow and eye velocity signals for heading perception. Neuron, 26, 747–752.
van den Berg, A. V., Beintema, J. A., & Frens, M. A. (2001). Heading and path percepts from visual flow and eye pursuit signals. Vision Research, 41, 3467–3486.
von Holst, E. (1954). Relations between the central nervous system and the peripheral organs. British Journal of Animal Behaviour, 2, 89–94.
von Holst, E., & Mittelstaedt, H. (1950). Das Reafferenzprinzip. Naturwissenschaften, 37, 464–476.
Warren, W. H., & Hannon, D. J. (1988). Direction of self-motion is perceived from optical flow. Nature, 336, 162–163.
Wertheim, A. H. (1987). Retinal and extraretinal information in movement perception: How to invert the Filehne illusion. Perception, 16, 299–308.
Wilkie, R. M., & Wann, J. P. (2006). Judgments of path, not heading, guide locomotion. Journal of Experimental Psychology: Human Perception and Performance, 32, 88–96.
Wylie, D. R., Bischof, W. F., & Frost, B. J. (1998). Common reference frame for neuronal coding of translational and rotational optic flow. Nature, 392, 278–282.
Zhang, T., Heuer, H. W., & Britten, K. H. (2004). Parietal area VIP neuronal responses to heading stimuli are encoded in head-centered coordinates. Neuron, 42, 993–1001.
Figure 1
 
Standard block diagram representing the corollary discharge/efference copy theory. A copy of the motor signal sent to move the eye is also sent to a comparator unit. If the retinal motion signal (bottom of figure) arrives at the same time as the CD/EC signal, cancellation occurs and no motion is perceived.
Figure 2
 
Complexity introduced by motion of the observer. (a) Representation of the retinal image motion caused by a pursuit eye movement to the right while observing static isolated points in the world. The flow field is largely uniform in speed and direction. (b) Image motion generated during forward translation of the observer with no eye movement. The image motion radiates out from a point in the middle of the visual field coinciding with the direction of heading. (c) Image motion generated during simultaneous forward translation and smooth pursuit to the right. The image motion is made up of the vector sum of the vectors in (a) and (b) and contains multiple directions and speeds.
Figure 3
 
(a) The eye-rotation problem in vector form. The image motion generated by the eye movement (R) is vector-added to the motion generated by observer translation (T) to give S. R must be subtracted from S to recover T. (b) Standard vector addition. Each arrow represents an image velocity vector with speed and direction shown in brackets. The vector sum is derived from the projected components of the two vectors onto the X and Y axes (see dashed lines).
Figure 4
 
An alternative way of carrying out vector addition. Each vector is projected onto a series of projection axes, which sample the full 360° range of directions. The projected component of the vector is plotted in Cartesian form on the right. (a) Vector components for A. Three example projections are shown on the left (see dashed lines). The vector component corresponds to the distance from the origin to the point where the dashed line meets the axis. (b) Vector component distribution for the B vector. (c) If the distributions of vector components for A and B are summed, the new distribution has a peak amplitude and phase corresponding to the length and direction of the vector sum, A + B (see standard vector plot on the left).
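The equivalence described in this caption can be checked numerically. The following is a minimal sketch, not the authors' implementation: the vectors `A` and `B`, the number of projection axes, and the function names are all hypothetical choices made for illustration. It projects each vector onto evenly spaced axes, sums the two cosine distributions, and reads the amplitude and phase back from the first Fourier component of the sum.

```python
import math

def cosine_distribution(speed, direction_deg, n_axes=36):
    """Project a vector onto n_axes evenly spaced axes covering 360 degrees."""
    step = 360.0 / n_axes
    return [speed * math.cos(math.radians(k * step - direction_deg))
            for k in range(n_axes)]

def decode(distribution):
    """Return (amplitude, phase in degrees) of a sampled cosine distribution.

    Uses the first Fourier component, which is exact for a pure cosine
    sampled on three or more evenly spaced axes.
    """
    n = len(distribution)
    step = 360.0 / n
    x = 2.0 / n * sum(d * math.cos(math.radians(k * step))
                      for k, d in enumerate(distribution))
    y = 2.0 / n * sum(d * math.sin(math.radians(k * step))
                      for k, d in enumerate(distribution))
    return math.hypot(x, y), math.degrees(math.atan2(y, x)) % 360.0

def cartesian_sum(a, b):
    """Standard vector addition via X and Y components (as in Figure 3b)."""
    x = a[0] * math.cos(math.radians(a[1])) + b[0] * math.cos(math.radians(b[1]))
    y = a[0] * math.sin(math.radians(a[1])) + b[0] * math.sin(math.radians(b[1]))
    return math.hypot(x, y), math.degrees(math.atan2(y, x)) % 360.0

# Hypothetical example vectors as (speed, direction in degrees).
A = (5.0, 20.0)
B = (3.0, 110.0)

summed = [a + b for a, b in zip(cosine_distribution(*A), cosine_distribution(*B))]
amp_dist, dir_dist = decode(summed)
amp_cart, dir_cart = cartesian_sum(A, B)
print(amp_dist, dir_dist)
print(amp_cart, dir_cart)
```

Both routes give the same amplitude and direction up to floating-point error, because the sum of two cosines of the same period is again a cosine whose amplitude and phase are those of the vector sum.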
Figure 5
 
Vector subtraction using visual motion and extraretinal signals. (a) Cosine distribution representing the output of a set of velocity sensors located at a particular image location. The overall image motion is in a 30° direction with a speed of 8°/s. The broad tuning of the motion sensors means that other directions are activated as well and we assume that the tuning is cosine. The flower-like inset represents the velocity outputs in polar plot form with the gray lines indicating inhibitory signals. (b) Cosine distribution of activity arising from an extraretinal (pursuit signal) source. The ‘efference copy/corollary discharge’ signal distribution is set to have an amplitude proportional to the speed of the image motion created by the eye movement (6.9°/s) and phase equal to the direction (180°). (c) Sum of the two distributions shown in (a) and (b). It has a peak and phase corresponding to the amplitude and direction of the vector sum of the two motions shown in (a) and (b).
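The numbers quoted in this caption can be run through the same summation. This is only a sketch under stated assumptions: it assumes ideal cosine tuning, one hypothetical sensor axis per degree (`N_AXES`), and treats the extraretinal signal simply as a second cosine distribution with the amplitude (6.9) and phase (180°) given in the caption. The function name `cosine_activity` is illustrative.

```python
import math

N_AXES = 360  # hypothetical: one direction-tuned axis per degree

def cosine_activity(speed, direction_deg):
    """Cosine-shaped activity across axes; negative values stand for inhibition."""
    return [speed * math.cos(math.radians(axis - direction_deg))
            for axis in range(N_AXES)]

retinal = cosine_activity(8.0, 30.0)        # panel (a): 8 deg/s at 30 deg
extraretinal = cosine_activity(6.9, 180.0)  # panel (b): amplitude 6.9, phase 180 deg

# Panel (c): element-wise sum of the two distributions.
combined = [r + e for r, e in zip(retinal, extraretinal)]

# Read out the peak of the summed distribution.
peak_axis = max(range(N_AXES), key=lambda axis: combined[axis])
peak_speed = combined[peak_axis]
print(peak_axis, peak_speed)
```

With these values the horizontal components very nearly cancel (8 cos 30° ≈ 6.93 against 6.9), so the summed distribution peaks close to 90° with an amplitude of roughly 4°/s, which is what plain vector arithmetic on the two motions gives.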
Figure 7
 
Integration of motion information across the visual field by heading detectors. (a) Heading detector tuned to direction 0° azimuth and 0° elevation. The blue vectors indicate image motion generated during pure translation in the (0°, 0°) direction. The red vectors show the image motion caused by a pursuit eye movement to the right and the black vectors represent the vector sum (T + R). The heading detector sums the components of the motion vectors along the radial directions (gray lines). (b) Outputs from a set of heading templates tuned to a range of azimuth directions in response to the vectors shown in (a). The blue curve shows the response distribution when no eye rotation occurs (blue vectors). The template tuned to 0° azimuth responds the most and the correct heading is signaled. The black curve shows the heading detector output distribution when a pursuit signal occurs. Without some sort of compensation mechanism, the heading is incorrectly signaled as being 15° to the right.
Figure 8
 
Compensation for eye movements during forward translation using an efference copy/corollary discharge signal. (a) Local vector subtraction using cosine components. The projection of the vectors (see dashed lines) onto the radial lines is equal to the cosine component of the vectors along the radial directions. The component of the rotation vector (R) is subtracted from the component of the T + R vector to give the original T vector. (b) Because the vector subtraction operation restores the original T vectors at each location, the heading detector output distribution is the same as that shown in Figure 7b and now peaks at the correct heading (0° azimuth). (c) Compensation after integration. A combined signal, E (equal to the sum of the −R components shown in (a)), is added to the summed activity of each heading detector. (d) Distribution of activity across heading detectors. The black line is the combined T + R distribution, which signals the incorrect heading direction. The red line is the E signal applied across the different heading detectors. The vertical dashed line shows the activity levels required for the detector shown in (c). When the black and red distributions are added, the resulting distribution (blue curve) is equal to the pure translation distribution (T) and signals the correct heading.
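The equivalence between local compensation (panels a and b) and compensation after integration (panels c and d) follows from the linearity of the summation. The sketch below illustrates this with a deliberately simplified, hypothetical 2-D model: the image locations, the expansion rate, the uniform rotation flow, and the set of detector azimuths are assumptions chosen for illustration and are not the simulation parameters behind the figure.

```python
import math

def radial_unit(point, focus):
    """Unit vector pointing from a detector's preferred heading to an image point."""
    dx, dy = point[0] - focus[0], point[1] - focus[1]
    norm = math.hypot(dx, dy)
    return dx / norm, dy / norm

def detector_response(points, flows, focus):
    """Sum the components of the flow vectors along the detector's radial directions."""
    total = 0.0
    for p, v in zip(points, flows):
        ux, uy = radial_unit(p, focus)
        total += v[0] * ux + v[1] * uy
    return total

# Hypothetical image locations (degrees); none lies on the detector row (y = 0).
points = [(x, y) for x in (-15.0, -5.0, 5.0, 15.0) for y in (-10.0, 10.0)]
T = [(0.2 * x, 0.2 * y) for x, y in points]   # expansion from heading (0, 0)
R = [(-3.0, 0.0) for _ in points]             # uniform flow, assumed for pursuit
TR = [(t[0] + r[0], t[1] + r[1]) for t, r in zip(T, R)]
neg_R = [(-r[0], -r[1]) for r in R]

azimuths = list(range(-30, 31, 5))            # hypothetical heading detectors
pure = [detector_response(points, T, (a, 0.0)) for a in azimuths]
combined = [detector_response(points, TR, (a, 0.0)) for a in azimuths]

# Compensation after integration: one E value per detector, added to its sum.
E = [detector_response(points, neg_R, (a, 0.0)) for a in azimuths]
after_integration = [c + e for c, e in zip(combined, E)]

# Local compensation: subtract R at every location before integrating.
local_flows = [(s[0] - r[0], s[1] - r[1]) for s, r in zip(TR, R)]
local = [detector_response(points, local_flows, (a, 0.0)) for a in azimuths]

def peak(responses):
    """Azimuth of the most active detector."""
    return azimuths[max(range(len(azimuths)), key=lambda i: responses[i])]

print(peak(pure), peak(combined), peak(after_integration), peak(local))
```

In this toy setup the uncompensated distribution peaks away from the true heading, while both compensation routes reproduce the pure-translation distribution. The size of the uncompensated shift here is a consequence of the chosen expansion rate and rotation speed, not a reproduction of the figure.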