Abstract
A sequence of images is convolved with a bank of filters tuned for orientation and spatial frequency. A two-max rule (i.e., selecting the two largest responses) is applied to the filter outputs at fixed time intervals; the selected outputs are required to come from two differently oriented Gabor filters of similar spatial frequency. Zero-crossings are then extracted from these outputs. The zero-crossings from each of the two filters correspond to the velocity constraint lines used to compute the “intersection of constraints” (IOC). Tracking any intersecting zero-crossing over time yields the velocity predicted by the IOC. Over time these intersecting zero-crossings trace out motion streaks whose length corresponds to the IOC speed and whose orientation corresponds to the IOC direction. The Hough transform is used to identify these streaks: because they fall along similarly oriented lines, they appear as peaks in Hough space. Temporal-frequency-tuned surround-suppression (end-stopped) filters encode these oriented streaks because they are tuned for line length and orientation; matching the temporal frequency tuning to the line length provides a speed-tuned response. The model can explain why stationary or non-coherent motion affects perceived motion; why most plaids are perceived to move in the IOC direction; why some are instead perceived to move in the vector-average direction; why, under specific conditions, adapting out the IOC leaves motion perceived in the vector-average direction, and vice versa; why motion perception is affected by non-linearities such as squaring; and the phenomenon of “motion streaks”. The model is invariant to both contrast and phase and appears to be consistent with physiological observations of V1 and MT neurons.
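As a concrete illustration of the first stages summarised above, the following Python sketch (not the authors' implementation) applies a two-max rule to pooled responses from an oriented Gabor filter bank and solves the intersection of constraints for two constraint lines. The filter parameters, the global pooling of response energy, and the example drift values are illustrative assumptions only.

```python
# Minimal sketch of the two-max rule and the IOC velocity estimate.
# All parameters and the example inputs are illustrative assumptions.
import numpy as np
from scipy.ndimage import convolve


def gabor_kernel(orientation, wavelength, sigma=4.0, size=21):
    """Even-symmetric Gabor filter at the given orientation (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)


def two_max_rule(frame, orientations, wavelength):
    """Return the two orientations whose filters respond most strongly.

    Responses are pooled globally (summed energy) for brevity; the model
    applies the rule to the response outputs at fixed time intervals.
    """
    energies = []
    for theta in orientations:
        response = convolve(frame, gabor_kernel(theta, wavelength), mode="nearest")
        energies.append(np.sum(response**2))
    best = np.argsort(energies)[-2:]  # indices of the two largest responses
    return [orientations[i] for i in best]


def ioc_velocity(theta1, s1, theta2, s2):
    """Solve v . n1 = s1 and v . n2 = s2 for the IOC velocity.

    theta1, theta2: normal directions of the two constraint lines (radians);
    s1, s2: speeds measured along those normals.
    """
    n = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    return np.linalg.solve(n, np.array([s1, s2]))


if __name__ == "__main__":
    # Two gratings drifting at unit speed along normals 30 degrees apart:
    # the IOC prediction lies between them and is faster than either.
    v = ioc_velocity(np.deg2rad(75.0), 1.0, np.deg2rad(105.0), 1.0)
    print("IOC velocity:", v, "speed:", np.linalg.norm(v))
```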