Detection of motion is a crucial component of visual processing. To probe the computations underlying motion perception, we created a new class of non-Fourier motion stimuli, characterized by their third- and fourth-order spatiotemporal correlations. As with other non-Fourier stimuli, they lack second-order correlations, and therefore their motion cannot be detected by standard Fourier mechanisms. Additionally, these stimuli lack pairwise spatiotemporal correlation of edges or flicker—and thus, also cannot be detected by extraction of one of these features, followed by standard motion analysis. Nevertheless, many of these stimuli produced apparent motion in human observers. The pattern of responses—i.e., which specific spatiotemporal correlations led to a percept of motion—was highly consistent across subjects. For many of these stimuli, inverting the overall contrast of the stimulus reversed the direction of apparent motion. This “reverse-phi” phenomenon challenges existing models, including models that correlate low-level features and gradient models. Our findings indicate that current knowledge of the computations underlying motion processing is as yet incomplete, and that understanding how high-order spatiotemporal correlations lead to motion percepts will illuminate the computations underlying early motion processing.

*N* × *N* array of checks at coordinates (*x, y*). In Step 1, checks in the first row (the checks (*x*, 1)) and the first column (the checks (1, *y*)) of the texture are randomly assigned to black or white. In Step 2, the glider is placed on the corner of the texture so that it covers 4 checks: (1, 1), (1, 2), (2, 1), and (2, 2). Since (1, 1), (1, 2), and (2, 1) are already colored in Step 1, the fourth check (2, 2) can be determined by counting the total number of black checks among the 3 known checks. If the number is even, the check (2, 2) is colored white; if the number is odd, it is colored black. This way, the total number of black checks within the glider is even. In Step 3, the glider is moved by one unit along the *x*-direction, and the method in Step 2 is used to determine the color of check (3, 2). The glider is then moved in successive one-check steps, until all checks (*x*, 2) are colored. At this point, the glider is moved to the first column of the next row, and the process is repeated. The whole texture is made using this recursive method after initialization, and therefore we can be sure that within any glider, the total number of black checks is even. Note that the above construction can be carried out for gliders of other shapes, not just a 2 × 2 square.
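The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `make_even_texture`, the 0/1 encoding (1 = black, 0 = white), the zero-based indexing, and the `seed` parameter are all assumptions introduced here.

```python
import random

def make_even_texture(n, seed=0):
    """Color an n x n array so that every 2 x 2 glider holds an even number of black checks.

    Encoding (an assumption of this sketch): 1 = black, 0 = white; tex[y][x].
    """
    rng = random.Random(seed)
    tex = [[0] * n for _ in range(n)]
    # Step 1: the first row and the first column are assigned at random.
    for i in range(n):
        tex[0][i] = rng.randint(0, 1)
        tex[i][0] = rng.randint(0, 1)
    # Steps 2 and 3: slide the glider along each row in turn; the fourth check
    # is black exactly when the three known checks contain an odd number of blacks.
    for y in range(1, n):
        for x in range(1, n):
            known_blacks = tex[y - 1][x - 1] + tex[y - 1][x] + tex[y][x - 1]
            tex[y][x] = known_blacks % 2
    return tex

texture = make_even_texture(8)
```

Other glider shapes would change only which previously colored checks enter `known_blacks`.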

*x, y*) to a three-dimensional spatiotemporal array of voxels (*x, y, t*). Correspondingly, the defining glider is a set of three or four nearby spatiotemporal voxels. The movie (see Supplementary data) is then colored with black and white voxels, with the requirement that within any glider, the total number of black voxels must have a particular parity (even or odd). Checks that cannot be determined by the glider rule, such as the initial frame's checks or the boundary checks, are randomly assigned black or white. This process is presented in Figure 1 and formally described in Appendix A. Note that in Figure 1 (and later figures), the voxels of the glider are shown by coloring several corners of a wireframe cube. That is, the wireframe cube represents a 2 × 2 × 2 region, and each of its colored corners represents a voxel in the glider. The three colored corners are three voxels that form the glider, with different colors indicating differences in time.
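A minimal sketch of this spatiotemporal construction follows, assuming one particular three-element glider: the voxels (*x*, *y*, *t*), (*x* + 1, *y*, *t*), and (*x* + 1, *y*, *t* + 1). The function name, the array layout `movie[t][y][x]`, and the 0/1 encoding are illustrative choices, not taken from the paper.

```python
import random

def make_glider_movie(nx, ny, nt, parity=0, seed=0):
    """Binary movie (1 = black, 0 = white) obeying a three-element glider rule.

    Assumed glider: (x, y, t), (x+1, y, t), (x+1, y, t+1).
    parity = 0 gives an even movie, parity = 1 an odd movie.
    """
    rng = random.Random(seed)
    # Start from a fully random movie; the first frame and the x = 0 boundary
    # column keep these random values because the rule cannot determine them.
    movie = [[[rng.randint(0, 1) for _ in range(nx)] for _ in range(ny)]
             for _ in range(nt)]
    for t in range(1, nt):
        for y in range(ny):
            for x in range(1, nx):
                # Choose the last voxel so the three voxels sum to `parity` mod 2.
                movie[t][y][x] = (movie[t - 1][y][x - 1] + movie[t - 1][y][x] + parity) % 2
    return movie

even_movie = make_glider_movie(nx=32, ny=32, nt=20, parity=0)
odd_movie = make_glider_movie(nx=32, ny=32, nt=20, parity=1)
```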

*x*-axis, with a spacetime displacement between the pairs (grouped in Figure 2A by dashed lines). Since the parity rule requires that an even number of these voxels are black, the only possibilities are that both pairs contain one black voxel and one white voxel, or that all of the voxels match within each pair (one black pair and one white pair, or all voxels black, or all voxels white). In the first case, there is an edge orthogonal to the *x*-axis between each pair of voxels; in the second case, there is no edge at either pair of voxels. That is, an edge in one location and time requires a similarly oriented edge in another location and time. The result is an edge that propagates along a spacetime diagonal. In sum, a specific fourth-order correlation (defined by the glider) corresponds to propagation of an edge along a spacetime diagonal. This propagation of a feature yields standard non-Fourier motion. Note that had we used the odd-parity rule, the same pattern of correlations would be present, but some would be negative.

*x*–*t* slices of the movies corresponding to the gliders in Figures 2A and 2B, and demonstrate that these fourth-order correlations induce a visually obvious diagonal structure in spacetime.

*x*-axis, and the glider of Figure 2B leads to a spacetime diagonal of correlated flicker. In sum, we have seen that if a glider consists of two parallel pairs of adjacent checks, it will generate a standard non-Fourier stimulus.

*not* consist of two parallel pairs of adjacent checks: the gliders that generate the stimuli we study here. One way to do this is to use a glider with four elements, but to choose the elements so that they do not form two parallel pairs (Figure 3A). Another way to do this is to use a glider with only three elements (Figure 3B). Figures 3C and 3D show the *x*–*t* slices of the corresponding movies. Figure 3C has no evident visual structure. Figure 3D has a visually obvious diagonal structure, but this does not arise from pairwise correlations. This is illustrated by the absence of correlations in Figure 3E and proved in Appendix A.

*x*–*t* plane cannot contain all voxels of the four-element glider shown in Figure 3A, and therefore none of the checks in the *x*–*t* slice are correlated, at any order. For this stimulus, statistical structure is *only* present when double-layer slices at specific orientations are considered. Visual inspection of Figure 3D can also be misleading. For this and other three-element gliders, a visually evident diagonal structure is present in the appropriate spacetime plane. However, this structure is not based on pairwise correlations and thus is not available to Fourier mechanisms. Moreover, the kinds of spatiotemporal correlations that are present do not correspond to “flicker motion” or “edge motion” and, thus, would not be available to standard non-Fourier mechanisms either. This is shown in Appendix A.
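The absence of pairwise correlations alongside a strict third-order constraint can be checked numerically on a simulated *x*–*t* slice. The sketch below assumes a three-element glider occupying (*x*, *t*), (*x* + 1, *t*), and (*x* + 1, *t* + 1) with the even-parity rule; the glider of Figure 3B may differ, and the helper names `xt_slice` and `mean_product` are hypothetical.

```python
import random

def xt_slice(nx, nt, seed=0):
    """Even-parity x-t slice for the assumed glider, returned as contrasts (+1 black, -1 white)."""
    rng = random.Random(seed)
    v = [[rng.randint(0, 1) for _ in range(nx)] for _ in range(nt)]
    for t in range(1, nt):
        for x in range(1, nx):
            v[t][x] = (v[t - 1][x - 1] + v[t - 1][x]) % 2
    return [[2 * bit - 1 for bit in row] for row in v]

def mean_product(c, offsets):
    """Average, over all positions, of the product of contrasts at the given (dx, dt) offsets."""
    nt, nx = len(c), len(c[0])
    max_dx = max(dx for dx, _ in offsets)
    max_dt = max(dt for _, dt in offsets)
    total, count = 0, 0
    for t in range(nt - max_dt):
        for x in range(nx - max_dx):
            p = 1
            for dx, dt in offsets:
                p *= c[t + dt][x + dx]
            total += p
            count += 1
    return total / count

c = xt_slice(200, 200)
pairwise = mean_product(c, [(0, 0), (1, 1)])            # second-order: close to 0
third_order = mean_product(c, [(0, 0), (1, 0), (1, 1)])  # glider triple: exactly -1
```

The pairwise average fluctuates around zero because of the finite sample, while the product over the glider's three voxels is the same at every position.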

*t* (filled green circles), and the voxels in the plane at time *t* + 1 (filled blue circles). The centroid direction is the vector from the centroid of the voxels at time *t* (open green circle) to the centroid of the voxels at time *t* + 1 (open blue circle). Note that for many gliders, the centroid motion direction may be oblique (Figures 4B–4D). The strategy of finding the centroid direction can be extended to gliders that span multiple time slices, by choosing the direction to be the vector that is the best fit to the centroids in each time slice in the least-squares sense.
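The centroid rule can be sketched as follows. The function name `centroid_direction` is hypothetical, and the least-squares fit is assumed to be an ordinary unweighted regression of the per-slice centroids on time; for a glider spanning two adjacent time slices this reduces to the vector between the two centroids.

```python
def centroid_direction(voxels):
    """Return (dx/dt, dy/dt) for a glider given as a list of (x, y, t) voxels.

    Assumption of this sketch: an unweighted least-squares line is fitted
    through the centroids of the time slices.
    """
    slices = {}
    for x, y, t in voxels:
        slices.setdefault(t, []).append((x, y))
    if len(slices) < 2:
        raise ValueError("the glider must span at least two time slices")
    times = sorted(slices)
    cx = [sum(p[0] for p in slices[t]) / len(slices[t]) for t in times]
    cy = [sum(p[1] for p in slices[t]) / len(slices[t]) for t in times]
    n = len(times)
    t_mean = sum(times) / n
    var_t = sum((t - t_mean) ** 2 for t in times)

    def slope(c):
        c_mean = sum(c) / n
        return sum((t - t_mean) * (ci - c_mean) for t, ci in zip(times, c)) / var_t

    return slope(cx), slope(cy)

# Two voxels side by side along x at time 0, one voxel at time 1.
triangle = [(0, 0, 0), (1, 0, 0), (1, 0, 1)]
# Three voxels at time 0 and one at time 1: an oblique centroid direction.
pyramid = [(0, 0, 0), (1, 1, 0), (0, 1, 0), (1, 0, 1)]
print(centroid_direction(triangle))
print(centroid_direction(pyramid))
```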

*F*_{even} > 0.5), while for the other parity (e.g., odd), percepts were systematically biased opposite to the centroid direction (*F*_{odd} < 0.5). To ask whether the in-centroid direction percept was stronger than the opposite-direction percept, we proceeded as follows.

*F*_{even} − 0.5) > (0.5 − *F*_{odd}). Similarly, if the odd-parity stimulus elicited motion in the centroid direction, and this was stronger than the percept of motion in the opposite direction elicited by the even stimuli, we would have (*F*_{odd} − 0.5) > (0.5 − *F*_{even}). Both of these are equivalent to *F*_{even} + *F*_{odd} − 1 > 0. So, the index of whether centroid-direction perceived motion for one parity was stronger than the opposite-direction percept for the other parity is whether *F*_{even} + *F*_{odd} − 1 is greater than 0.

*S*, based on *F*_{even} + *F*_{odd} − 1:

*F* from 0.5) than the parity that produced the opposite motion (as measured by the negative deviation of *F* from 0.5). An index of 0 means that the percepts were equally strong, and a negative index means that the percept of centroid motion was weaker.

*F*_{even} and *F*_{odd} for one glider) were inverted with respect to chance performance (0.5). That is, the change was made as follows: *F*_{surrogate-even} = 1 − *F*_{odd}, *F*_{surrogate-odd} = 1 − *F*_{even}.

*S*_{surrogate}) from these data sets, via Equation 1. The fraction of surrogate index values *S*_{surrogate} higher than the index constructed from the original data was used to estimate the probability that the observed value of the index *S* could be due to chance.
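The surrogate procedure can be sketched as below, under two loudly stated assumptions. First, Equation 1 is not reproduced in this text, so the sketch takes the index to be the mean of *F*_{even} + *F*_{odd} − 1 across gliders; the published index may be defined differently. Second, each glider's pair is assumed to be inverted independently with probability 0.5. The function names and the example fractions are illustrative, not data from the study.

```python
import random

def strength_index(pairs):
    """ASSUMED stand-in for Equation 1: mean of F_even + F_odd - 1 over gliders."""
    return sum(f_even + f_odd - 1 for f_even, f_odd in pairs) / len(pairs)

def surrogate_p_value(pairs, n_surrogates=10000, seed=0):
    """Fraction of surrogate data sets whose index exceeds the observed index."""
    rng = random.Random(seed)
    observed = strength_index(pairs)
    higher = 0
    for _ in range(n_surrogates):
        surrogate = []
        for f_even, f_odd in pairs:
            if rng.random() < 0.5:
                # Invert the pair with respect to chance performance (0.5).
                surrogate.append((1 - f_odd, 1 - f_even))
            else:
                surrogate.append((f_even, f_odd))
        if strength_index(surrogate) > observed:
            higher += 1
    return higher / n_surrogates

# Made-up (F_even, F_odd) pairs for three gliders, for illustration only.
example_pairs = [(0.90, 0.30), (0.80, 0.40), (0.70, 0.45)]
print(surrogate_p_value(example_pairs))
```

Inverting a pair flips the sign of its contribution *F*_{even} + *F*_{odd} − 1, so the surrogates form a distribution that is symmetric about zero.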

*p* < 0.05, two-tailed). Of the 10 three-element gliders, 9 produced a consistent apparent motion percept. Of the 13 four-element gliders (excluding the negative control), 7 showed consistent apparent motion. Note that we are using the direction of centroid motion simply as a reference, and thus, these counts include all of the stimuli that elicited motion in the centroid direction or opposite to it, as long as it was consistent.

*t* is followed by an edge in the adjacent position at time *t* + 1. This stimulus generated the strongest motion percept among all four-element gliders with two elements at each time, although some of the stimuli with three elements at one time and one element at another time (Figure 6A) and many of the three-element gliders (Figure 5) generated motion percepts that were similar in strength.

*p* < 0.001) and four-element gliders (*p* < 0.01), via the surrogate data method described in the Methods section.

*f* that is not merely quadratic, and (3) combining opponent mechanisms. As an example, we start with the same three-element glider in Figure 7C, and consider the nonlinearity *f*(*z*) = *z*^{3}. We denote the luminances of the three checks by *c*_{1}, *c*_{2}, and *c*_{3}, where black is represented by +1 and white by −1. Note that *f*(*z*) = (*c*_{1} + *c*_{2} + *c*_{3})^{3} contains a term *c*_{1}*c*_{2}*c*_{3}. This term effectively calculates the parity in the glider, since it is negative if there is an even number of black checks, and positive if there is an odd number. So, when the glider is placed on an even stimulus, this term is always negative. Note that this only happens because the three checks are constrained by the glider; if *z* summed any other set of three checks, the term would be positive or negative with equal probability. (Since there are no pairwise correlations, the other terms in the expansion of *z*^{3} do not contribute to its average over the stimulus.) Thus, we can construct an opponent mechanism by comparing the average value of the nonlinearity *z*^{3} when applied to triplets of checks within the glider, to its average value when applied to triplets of checks within another configuration, i.e., the glider facing the opposite direction. For further details on this calculation, see Appendix B.
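As a worked check of this argument, the cube can be expanded explicitly. The only fact used beyond the text is that *c*_{i}^{2} = 1 when each *c*_{i} is +1 or −1:

```latex
\begin{aligned}
z^{3} = (c_1 + c_2 + c_3)^{3}
  &= \sum_{i} c_i^{3} + 3 \sum_{i \neq j} c_i^{2} c_j + 6\, c_1 c_2 c_3 \\
  &= 7\,(c_1 + c_2 + c_3) + 6\, c_1 c_2 c_3 .
\end{aligned}
```

Each single check averages to zero, so only the last term survives averaging. On an even-parity stimulus the product *c*_{1}*c*_{2}*c*_{3} equals −1 for every placement of the glider, giving an average of −6; for an unconstrained triplet the product averages to zero. These are the two averages listed in Table B1.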

Nonlinearity | Three-element gliders, even | Three-element gliders, odd | Four-element gliders, even | Four-element gliders, odd |
---|---|---|---|---|
x | 0 | 0 | 0 | 0 |
x^{2} | 0 | 0 | 0 | 0 |
x^{3} | −0.44 | 0.44 | 0 | 0 |
x^{4} | 0 | 0 | 0.26 | −0.26 |
∣x∣ | 0 | 0 | −0.25 | 0.25 |
∣x∣ for x ≥ 0; 0 for x < 0 | 0 | 0 | −0.18 | 0.18 |
x^{2} for x ≥ 0; 0 for x < 0 | −0.23 | 0.23 | 0 | 0 |
∣x∣^{0.72} for x ≥ 0; −0.08∣x∣^{0.72} for x < 0 | 0.11 | −0.11 | −0.21 | 0.21 |

*Note*: Entries indicate the normalized size of the motion signal generated by a mechanism that sums luminance within the glider, applies the indicated nonlinearity, and compares the resulting signal to that of a glider facing in the opposite direction. Positive values mean that the net average signal is in the centroid direction; negative values mean that it is opposite to the centroid direction. Zero means that no motion signal is generated.

*u* = *dx*/*dt* and *v* = *dy*/*dt*.

*u, v*) thus can be calculated from the gradients ∂*I*/∂*x*, ∂*I*/∂*y*, and ∂*I*/∂*t* in several ways, for example,

*V*(*ξ, η, τ*), where *ξ* is the *x*-coordinate of the voxel (as an integer), *η* is the *y*-coordinate of the voxel, *τ* is its time slice, and *V* is 0 or 1, according to the luminance. The spatiotemporal movies we consider here are defined by a “glider rule”: whenever a set of voxels forms a translation of the glider shape, the parity of the sum of their contents is constrained to be a constant *b*. We choose either *b* = 0 for the “even” parity movies, or *b* = 1 for the “odd” parity movies.

*ξ*_{i}, *η*_{i}, *τ*_{i}; *i* = 1, …, *N*). Each such triplet designates the position of one voxel within the glider.

*ξ, η, τ*), it induces constraints among many sets of voxels, not just those within the original single glider. Therefore, to understand the correlation structure of a spatiotemporal movie, we need to consider the implications of repeated instances of the glider rule (Equation A1). Specifically, we determine the effect of applying the glider rule *R* times; each application of the glider is done at a different starting point (*ξ*′_{j}, *η*′_{j}, *τ*′_{j}), *j* = 1, …, *R*. These iterated applications lead to

*ξ*′_{j}, *η*′_{j}, *τ*′_{j}) can be positive or negative. (For *R* = 1 and a single starting point at (0, 0, 0), the above equation reduces to Equation A1.)

*i* and *j*, some combinations (*ξ*_{i} + *ξ*′_{j}, *η*_{i} + *η*′_{j}, *τ*_{i} + *τ*′_{j}) appear multiple times. If they occur an even number of times, they cancel, since the sum is interpreted mod 2. Thus, repeated application of a glider at the *R* locations (*ξ*′_{j}, *η*′_{j}, *τ*′_{j}) can result in a relationship (Equation 3) in which there is cancellation among many of the *RN* terms in the summation. This will in turn result in a parity constraint among far fewer than *RN* voxels. We need to characterize these possible cancellations to understand whether repeated application of the glider rule can ever result in correlations involving only a small number of voxels.
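The cancellation can be made concrete with a small sketch. It assumes a three-element glider in the (*x*, *t*) plane with voxels at (0, 0), (1, 0), and (1, 1), applied at *R* = 3 starting points; the helper names are hypothetical. Of the *RN* = 9 terms, six cancel in pairs, leaving a constraint on only three voxels, which the sketch then confirms on a simulated even-parity slice.

```python
import random

def constrained_voxels(glider, shifts):
    """Voxels that survive after applying the glider at each shift and cancelling mod 2."""
    remaining = set()
    for sx, st in shifts:
        for gx, gt in glider:
            # Toggle membership: a voxel that appears an even number of times cancels.
            remaining ^= {(gx + sx, gt + st)}
    return remaining

glider = [(0, 0), (1, 0), (1, 1)]   # (x, t) offsets of the assumed glider
shifts = [(0, 0), (1, 0), (1, 1)]   # R = 3 starting points
survivors = constrained_voxels(glider, shifts)
print(sorted(survivors))  # → [(0, 0), (2, 0), (2, 2)]

def even_xt_slice(nx, nt, seed=0):
    """Even-parity x-t slice (1 = black, 0 = white) for the assumed glider; v[t][x]."""
    rng = random.Random(seed)
    v = [[rng.randint(0, 1) for _ in range(nx)] for _ in range(nt)]
    for t in range(1, nt):
        for x in range(1, nx):
            v[t][x] = (v[t - 1][x - 1] + v[t - 1][x]) % 2
    return v

v = even_xt_slice(40, 40)
# The three surviving voxels have even total parity at every position.
holds = all(sum(v[t + dt][x + dx] for dx, dt in survivors) % 2 == 0
            for t in range(38) for x in range(38))
print(holds)  # → True
```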

*G* with *N* elements occupying the voxels at integer coordinates (*ξ*_{i}, *η*_{i}, *τ*_{i}; *i* = 1, …, *N*), its generating function is defined by

*x, t*) plane (Figure 4A) has a generating function *G*_{triangle} = 1 + *x* + *xt*. A glider that forms a pyramid (Figure 4C) has a generating function *G*_{pyramid} = 1 + *xy* + *y* + *xt*. A glider that has two adjacent voxels at *t* = 0 along the *y*-axis that translates along the *x*-axis at *t* = 1 has a generating function *G*_{snf} = 1 + *y* + *xt* + *xyt*, so-called because it generates a “standard” non-Fourier movie (column 5 of Figure 6B).

*GS* corresponds to the voxels that are constrained by the iterated glider rule (Equation A2). This is formalized by Gilbert's Theorem 2: for configurations *T*, the total parity of their contents is constrained by the glider *G* if and only if the generating-function relationship

*S*. Gilbert also showed that if there is no such constraint, then both parities are equally likely. He proved these results for 2-dimensional colorings, but they generalize immediately to colorings of any dimension.

*G* for which there are no pairwise spatiotemporal correlations between pairs of horizontal luminance edges, pairs of vertical luminance edges, or pairs of temporal edges (flicker). Other than the labels associated with the coordinates, these three cases are identical, so we focus on the case of edges formed by pairs of adjacent voxels along the *x*-axis. We will show that the crucial condition is that the generating function of *G* is prime, in the sense defined below.

*T*. A double-domino configuration is a configuration of four voxels, arranged in two parallel pairs of adjacent voxels. We need to show that the total parity of *T* is unconstrained by the glider. This will imply that the total parity within one domino of *T* is independent of the total parity in the other domino of *T*. Since the total parity within one domino indicates the presence of an edge (1) or the absence of an edge (0), this will show that the co-occurrences of edges are uncorrelated. As mentioned above, we assume that the dominos are parallel to the *x*-axis.

*T*, we position the first domino of *T* at coordinates (0, 0, 0) and (1, 0, 0), and the second domino at coordinates (*ξ, η, τ*) and (*ξ* + 1, *η, τ*). The generating function of this four-voxel set is

*T* thus can be factored into two polynomials. The first factor, 1 + *x*, is the generating function for a domino along the *x*-axis. The second factor, *T*′ = (1 + *x*^{ξ}*y*^{η}*t*^{τ}), is the generating function of a 2-voxel configuration that expresses the relative position of the two dominos of *T*.

*GS* = *T*, i.e., only if

*S*.

*x*) is a prime, but (1 + *x*^{3}) is not, since (1 + *x*^{3}) = (1 − *x* + *x*^{2})(1 + *x*).

*G*_{triangle} = 1 + *x* + *xt* and *G*_{pyramid} = 1 + *xy* + *y* + *xt*. However, the generator for standard non-Fourier motion, *G*_{snf} = 1 + *y* + *xt* + *xyt* = (1 + *y*)(1 + *xt*), is not a prime. This factorization is the algebraic correlate of the fact that in the standard non-Fourier stimulus, an edge (represented by the term 1 + *y*) is correlated across space and time, represented by the term 1 + *xt*. In other words, if a glider's generating function can be factored, then it corresponds to a displaced pair of parallel edges. However, if it is prime, as it is for the stimuli we focus on, then it cannot be decomposed into a displaced pair of edges.
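The factorization of *G*_{snf} can be verified mechanically. In the sketch below, a generating function is represented as a set of exponent triples (powers of *x*, *y*, *t*), and coefficients are reduced mod 2, so a term produced twice cancels. The representation and the name `multiply` are choices made for this illustration, not notation from the paper.

```python
def multiply(p, q):
    """Product of two generating functions with coefficients taken mod 2.

    Each polynomial is a set of (ex, ey, et) exponent triples; for example,
    {(0, 0, 0), (1, 0, 1)} stands for 1 + xt.
    """
    result = set()
    for ax, ay, at in p:
        for bx, by, bt in q:
            # Toggle the term: an even number of occurrences cancels.
            result ^= {(ax + bx, ay + by, at + bt)}
    return result

ONE = (0, 0, 0)
edge_along_y = {ONE, (0, 1, 0)}                    # 1 + y
shift_in_xt = {ONE, (1, 0, 1)}                     # 1 + xt
G_snf = {ONE, (0, 1, 0), (1, 0, 1), (1, 1, 1)}     # 1 + y + xt + xyt

print(multiply(edge_along_y, shift_in_xt) == G_snf)  # → True

# Mod 2, the signs are immaterial: (1 + x)(1 + x + x^2) = 1 + x^3.
print(multiply({ONE, (1, 0, 0)}, {ONE, (1, 0, 0), (2, 0, 0)}) == {ONE, (3, 0, 0)})  # → True
```

Checking that a polynomial such as *G*_{triangle} is prime requires showing that no factorization exists, which this sketch does not attempt.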

*G* is a prime to show that there is no iterated application of *G* that can constrain the four voxels in a double-domino configuration *T*. Because of the unique factorization property, the left and right sides of Equation A8 must have the same prime factors. Since (1 + *x*) is a prime, it cannot have *G* as a factor. Thus, the only way that the left and right sides of Equation A8 can have the same prime factors is that *T*′ = (1 + *x*^{ξ}*y*^{η}*t*^{τ}) must be composite and contain *G* as a factor. That is, if Equation A8 holds, then so does

*Q*. In sum, we reduced a question about correlations within the 4-voxel set *T* to a question about correlations within a two-voxel set *T*′. We now show that Equation A9 is impossible, and, consequently, that Equation A8 is impossible too, thus implying that the total parity of the voxels of *T* must be independent.

*T*′ = (1 + *x*^{ξ}*y*^{η}*t*^{τ}) are correlated. This configuration contains one voxel at the origin, and one at the location (*ξ, η, τ*). The independence of these two voxels follows from Gilbert's construction of an “initial set” (Gilbert, 1980, Figure 1), in which all voxels are independent. Alternatively, one can see that these two voxels must be independent for specific gliders such as *G*_{triangle} = 1 + *x* + *xt* by observing that any product *GQ* must contain at least three distinct terms that do not cancel: one with the highest exponent of *x*, one with the highest exponent of *t*, and one with the lowest total exponent.

*T* that forms a parallelogram, by replacing the factor (1 + *x*) in Equation A8 with a factor (1 + *x*^{α}*y*^{β}*t*^{γ}). Thus, even if local features are defined in terms of pairwise correlations across non-adjacent voxels (i.e., by (1 + *x*^{α}*y*^{β}*t*^{γ})), such local features are themselves pairwise uncorrelated in spacetime.

*f*(*z*) = *z*^{3}, to a stimulus generated by a three-element glider *G* and the even-parity rule. We denote the luminances of the three checks within a glider by *c*_{1}, *c*_{2}, and *c*_{3}, where black is represented by +1 and white by −1. Therefore, the coloring of the glider placed at any position and time in the stimulus can be represented by a triplet (*c*_{1}, *c*_{2}, *c*_{3}).

*G*, and of the glider facing the opposite direction, denoted *G*′. Since the stimulus is constructed with glider *G* and the even-parity rule, the number of black voxels in *G* can only be 0 or 2. So the colorings of *G* have only 4 possibilities: (+1, +1, −1), (+1, −1, +1), (−1, +1, +1), and (−1, −1, −1). In contrast, the coloring of *G*′ has no such parity constraint, so the colorings can be (+1, +1, +1), (+1, +1, −1), (+1, −1, +1), (+1, −1, −1), (−1, +1, +1), (−1, +1, −1), (−1, −1, +1), and (−1, −1, −1), a total of 8 possibilities.

*f* applied. Then, we compare the average of this signal for *G* and *G*′. In each case, the allowed colorings are all equally likely (see Appendix A and Gilbert, 1980), so they contribute equally to the average. The process is detailed in Table B1.

Glider | Coloring (c_{1}, c_{2}, c_{3}) | Sum z = c_{1} + c_{2} + c_{3} | Nonlinearity f(z) = z^{3} | Average |
---|---|---|---|---|
G | (+1, +1, −1) | 1 | 1 | −6 |
 | (+1, −1, +1) | 1 | 1 | |
 | (−1, +1, +1) | 1 | 1 | |
 | (−1, −1, −1) | −3 | −27 | |
G′ | (+1, +1, +1) | 3 | 27 | 0 |
 | (+1, +1, −1) | 1 | 1 | |
 | (+1, −1, +1) | 1 | 1 | |
 | (+1, −1, −1) | −1 | −1 | |
 | (−1, +1, +1) | 1 | 1 | |
 | (−1, +1, −1) | −1 | −1 | |
 | (−1, −1, +1) | −1 | −1 | |
 | (−1, −1, −1) | −3 | −27 | |

*G* generates a signal of −6, and its mirror *G*′ generates a signal of 0. That is, for this particular nonlinearity and stimulus, this opponent mechanism results in a negative motion signal *m*(*G*) = *G* − *G*′ of (−6) − 0 = −6.

*m*(*G*). That is, we divide the motion signal *m*(*G*) by the root-mean-squared value that the nonlinearity would produce when placed on a random binary movie. The results in Table 1 are generated by this method.
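The whole calculation, including the normalization, can be sketched by enumerating colorings. The function name and signature are hypothetical, and the sketch reads "random binary movie" as all colorings of the glider's checks being equally likely, with the root-mean-squared value taken over *f*(*z*) itself. Under these assumptions, the cubic nonlinearity on a three-element even-parity glider gives −6/√183, about −0.44.

```python
from itertools import product
from math import sqrt

def normalized_motion_signal(n_checks, f, parity="even"):
    """Opponent signal m(G) = <f(z)> on G minus <f(z)> on G', divided by the RMS of f(z).

    Assumptions of this sketch: black = +1, white = -1; colorings of G are those
    with the required parity of black checks; colorings of G' are unconstrained.
    """
    colorings = list(product((+1, -1), repeat=n_checks))
    wanted = 0 if parity == "even" else 1
    allowed = [c for c in colorings if c.count(+1) % 2 == wanted]
    avg_g = sum(f(sum(c)) for c in allowed) / len(allowed)
    avg_g_mirror = sum(f(sum(c)) for c in colorings) / len(colorings)
    rms = sqrt(sum(f(sum(c)) ** 2 for c in colorings) / len(colorings))
    return (avg_g - avg_g_mirror) / rms

cubic_even = normalized_motion_signal(3, lambda z: z ** 3, "even")
abs_even = normalized_motion_signal(4, abs, "even")
print(round(cubic_even, 2), round(abs_even, 2))
```

Passing a different function for `f` covers the other nonlinearities, and `n_checks=4` covers the four-element gliders.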