**Understanding the depth ordering of surfaces in the natural world is one of the most fundamental operations of the primate visual system. Surfaces that undergo accretion or deletion (AD) of texture are always perceived to be behind an adjacent surface. An updated ForMotionOcclusion model (Barnes & Mingolla, 2013) includes two streams for computing motion signals and boundary signals. The two streams generate depth percepts such that AD signals together with boundary signals generate a farther depth on the occluded side of the boundary. The model fits the classical data (Kaplan, 1969) as well as the observation that moving surfaces tend to appear closer in depth (Royden, Baker, & Allman, 1988), for both binary and grayscale stimuli. The recent “Moonwalk illusion” described by Kromrey, Bart, and Hegdé (2011) upends the classical view that the surface undergoing AD always becomes the background. Here the surface that undergoes AD appears to be in front of the surrounding surface, a result of the random flickering noise in the surround. As an additional challenge, we developed an AD display with dynamic depth ordering. A new texture version of the Michotte rabbit hole phenomenon (Michotte, Thinès, & Crabbé, 1964/1991) generates depth that changes in part of the display area. Because the ForMotionOcclusion model separates the computation of boundaries from the computation of AD signals, it is able to explain the counterintuitive Moonwalk stimulus. We show simulations that explain the workings of the model and how the model explains the Moonwalk and textured Michotte phenomena.**

**Figure 1**

**Figure 2**

*dynamic* depth order structure. This stimulus is a texture version of the Michotte rabbit hole effect (Michotte, Thinès, & Crabbé, 1964/1991). The standard Michotte rabbit hole effect is illustrated in Figure 3. As the black circle moves back and forth, it is occluded on the right side by a rectangle the same color as the background, and it appears to slide in and out of a slit formed out of the background. The slit is always only as wide as it needs to be (just slightly longer than the chord of the circle where it is occluded), and when the circle is completely unoccluded, the slit disappears entirely.
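The geometry of the slit can be illustrated with a short sketch. It assumes a vertical occluding edge at a horizontal position `edge_x`; the function names and the `margin` parameter (standing in for "slightly longer") are hypothetical and are not taken from the stimulus code.

```python
import math

def occluded_chord(center_x, radius, edge_x):
    """Length of the chord where a vertical edge at edge_x cuts the circle."""
    offset = edge_x - center_x
    if abs(offset) >= radius:
        # The edge does not cut the circle, so there is no chord.
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - offset ** 2)

def slit_length(center_x, radius, edge_x, margin=0.1):
    """Apparent slit length: the chord plus an assumed fractional margin."""
    chord = occluded_chord(center_x, radius, edge_x)
    return chord * (1.0 + margin) if chord > 0 else 0.0

# Circle of radius 5 whose center lies 3 units left of the occluding edge.
example_chord = occluded_chord(0.0, 5.0, 3.0)
# Circle entirely to the left of the edge: completely unoccluded.
example_unoccluded = slit_length(-10.0, 5.0, 3.0)
```

In this sketch the slit length falls to zero as soon as the edge no longer cuts the circle, which corresponds to the slit disappearing when the circle is completely unoccluded.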

**Figure 3**

*and* in front of it.

*consistent* with a sequence of luminance changes is propagated uninhibited and detected.

*changes in elementary motion*. It can thus be viewed as a second-order Barlow–Levick model, where the starting and stopping of motion is detected instead of elementary motion. The key mechanism again is the inhibition of signals that are “incorrect.” In the case of accretion signals, the existence of previous motion inhibits the propagation. And in the case of deletion signals, any continuation of motion in the same direction again inhibits the propagation of the deletion signal.
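The inhibition scheme described above can be sketched in discrete form. This is a simplified illustration, assuming a binary motion map for a single direction (rightward) on a one-dimensional grid; it is not the shunting formulation used in the model, and all names are hypothetical.

```python
def rectify(x):
    """Half-wave rectification."""
    return x if x > 0 else 0

def motion_at(m, t, x):
    """Motion value at time t and position x, zero outside the grid."""
    if 0 <= t < len(m) and 0 <= x < len(m[0]):
        return m[t][x]
    return 0

def accretion_deletion(m, step=1):
    """Detect motion onsets (accretion) and offsets (deletion).

    Accretion: motion at (t, x) that is not preceded by motion upstream,
    at (t - 1, x - step). Previous motion inhibits the signal.
    Deletion: motion at (t - 1, x) that is not continued downstream,
    at (t, x + step). Continued motion inhibits the signal.
    """
    accretions, deletions = [], []
    for t in range(1, len(m)):
        for x in range(len(m[0])):
            a = rectify(motion_at(m, t, x) - motion_at(m, t - 1, x - step))
            d = rectify(motion_at(m, t - 1, x) - motion_at(m, t, x + step))
            if a:
                accretions.append((t, x))
            if d:
                deletions.append((t, x))
    return accretions, deletions

# A texture element appears at x = 3 (t = 2), moves right one unit per
# frame, and is last seen moving at x = 6 (t = 5).
frames, width = 7, 9
m = [[0] * width for _ in range(frames)]
for t, x in [(2, 3), (3, 4), (4, 5), (5, 6)]:
    m[t][x] = 1

accretions, deletions = accretion_deletion(m)
```

Only the start and the stop of the motion trajectory survive the inhibition; the intermediate positions are suppressed because motion there is both preceded and continued.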

**Figure 4**

*seen*) of motion, and depth.

**Figure 5**

**Figure 6**

*d*,” we get “all locations in this circle centered on *X*, in direction *d*.”

**Figure 7**

*A*, *B*, *C*, and *K* are constants and *X* and *I* are, respectively, the excitatory and inhibitory inputs. The values of *X* and *I* are generally unique to each equation. Equation 1 can also be solved at equilibrium, with the same meanings for the symbols. In most of the equations of the model, the constants *A*, *B*, *C*, and *K* are the same, which means that only the name of the layer *w* and the excitatory and inhibitory inputs are actually interesting and need to be specified. The layer (in this case *w*) is a spatially distributed collection of nodes whose values change over time. Thus all nodes can be indexed by spatial position *ij* as well as time *t*; as these indices are always present, they will not be written unless necessary, so that *w*(*k*, …) is shorthand for *w*(*t*, *ij*; *k*, …), where *k* refers to orientations (of edges) and the remaining indices refer to directions of motion and to the *on* and *off* systems (“±” refers to both together). Depending on the specific layer of the model, a subset of these dimensions will apply. Layers such as *w* represent fields of nodes but should be thought of as multidimensional arrays (indexed by orientation *k* and direction where these apply), and are therefore denoted by a bold symbol **w**. Subscripts denoting a specific position *ij* within the array are not needed (because all the nodes within a layer perform the same computation) and therefore are not used. If the spatial position or orientation is required to clarify a specific computation, then only the necessary variables are specified; for example, a term with shifted time and space indices refers to the value of *w* one time unit in the past and one spatial unit to the right. Subscripts, when used, mostly denote identification of a constant or kernel with a specific layer. In this way, the description remains valid whether simulations are implemented using a rectilinear, hexagonal, or spatially variant sampling regime.

*necessary* to use the faster equilibrium computations whenever possible.
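The trade-off between numerical integration and the equilibrium solution can be illustrated with a short sketch. It assumes the common shunting form d*w*/d*t* = *K*(−*A**w* + (*B* − *w*)*X* − (*C* + *w*)*I*), with *K* acting as a rate constant; this form and the default constants are assumptions for illustration and may differ from Equation 1.

```python
def shunting_rate(w, X, I, A=1.0, B=1.0, C=1.0, K=1.0):
    """Time derivative of a node under the assumed shunting form."""
    return K * (-A * w + (B - w) * X - (C + w) * I)

def shunting_equilibrium(X, I, A=1.0, B=1.0, C=1.0):
    """Closed-form steady state, obtained by setting the derivative to zero."""
    return (B * X - C * I) / (A + X + I)

def integrate(X, I, w0=0.0, dt=0.01, steps=5000, **constants):
    """Forward-Euler integration with constant inputs X and I."""
    w = w0
    for _ in range(steps):
        w += dt * shunting_rate(w, X, I, **constants)
    return w

w_euler = integrate(X=2.0, I=0.5)           # thousands of update steps
w_eq = shunting_equilibrium(X=2.0, I=0.5)   # a single evaluation
```

Under these assumptions the integrated value converges to the closed-form value, which requires only one evaluation per node instead of many time steps.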

**M** of finite size that is applied to the layer **w** at every location, yielding an output value at each location where it is applied. The simplified notation can be expanded into the more conventional notation. Most kernels are *isotropic*, the same in all directions, and are denoted by **M**; common forms of these kernels are multidimensional Gaussian kernels, such as a two-dimensional Gaussian kernel. Other kernels are *anisotropic*, such as edge detectors, and are denoted by **N**. Gabor-type kernels may also be used, and are denoted by **G**.
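As an illustration of applying an isotropic kernel at every location of a layer, the following sketch builds a two-dimensional Gaussian kernel and applies it to a small array. The normalization to unit sum, the zero padding at the borders, and the function names are assumptions made for this example only.

```python
import math

def gaussian_kernel(sigma, radius):
    """Isotropic 2-D Gaussian kernel, normalized to unit sum (assumed)."""
    size = 2 * radius + 1
    kernel = [[math.exp(-((i - radius) ** 2 + (j - radius) ** 2)
                        / (2.0 * sigma ** 2))
               for j in range(size)] for i in range(size)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def apply_kernel(kernel, layer):
    """Apply the kernel at every location of the layer (zero padding).

    This is a correlation; for a symmetric kernel such as the Gaussian
    it gives the same result as a convolution.
    """
    r = len(kernel) // 2
    rows, cols = len(layer), len(layer[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        acc += kernel[di + r][dj + r] * layer[ii][jj]
            out[i][j] = acc
    return out

M = gaussian_kernel(sigma=1.0, radius=2)
layer = [[0.0] * 7 for _ in range(7)]
layer[3][3] = 1.0                      # a single active node
out = apply_kernel(M, layer)           # spreads the activity isotropically
```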

*H* is the neighborhood of location *ij* (implicit in the definition of *F*), which in a rectangular simulation would correspond to *H* = {(*i* + 1, *j*), (*i* − 1, *j*), (*i*, *j* + 1), (*i*, *j* − 1)}; *δ* and *ε* are constants. These equations can be rewritten in the simplified notation.

**w** is the sum of two Gabor components convolved with an input layer **i**; **G** is a Gabor kernel of orientation *k* with spread *σ*_{g}; and both the even (phase = *π*/2) and odd (phase = 0) parts are used in the computation.

*L*, varying as a function of space and time, i.e., *L*(*t*, *i*, *j*). Luminance increases define the *on* channel and luminance decreases define the *off* channel. Remember that all outputs are half-wave rectified ([*x*]^{+}) unless otherwise specified. Inhibitory interneurons *a*() implement Barlow–Levick inhibition; in their governing equation, the shifted index denotes the location adjacent to *ij* in the given direction. The equations for *a*() and *b*() are very similar, but remember that *a*() represents known interneurons that inhibit each other, as well as *b*() neurons, whereas *b*() represents excitatory neurons that represent motion in a specific direction. Figure 8 explains the geometrical arrangement of these motion neurons. The signals are combined by adding the rectified *on* and *off* components to yield elementary motion signals. Short-range filtering is then performed, where *σ*_{1}, *σ*_{2}, and *σ*_{3} are the sizes of Gaussian kernels and *F*_{1} and *F*_{2} are constants. The kernel defined by *σ*_{1} is the small isotropic on-center for the same direction and also off-center for *other* directions, whereas *σ*_{2} and *σ*_{3} are the anisotropic off-surround for the same direction (as shown in Figure 8C). This signal is then smoothed using a direction-specific uniform-density circular kernel with radius *σ*_{e}.
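The first stages of this pipeline (on and off channels, Barlow–Levick inhibition, and the summation of rectified components) can be sketched in a discrete, one-dimensional form. This is a minimal illustration under simplifying assumptions: a subtractive veto with a one-frame delay replaces the shunting equations for *a*() and *b*(), and the short-range filtering and smoothing stages are omitted. All names are hypothetical.

```python
def rectify(x):
    """Half-wave rectification, [x]^+."""
    return x if x > 0 else 0.0

def on_off(frames):
    """On channel: luminance increases. Off channel: luminance decreases."""
    on, off = [], []
    for t in range(1, len(frames)):
        pairs = list(zip(frames[t], frames[t - 1]))
        on.append([rectify(cur - prev) for cur, prev in pairs])
        off.append([rectify(prev - cur) for cur, prev in pairs])
    return on, off

def directional(channel, step):
    """Barlow-Levick style veto preferring motion toward +step.

    Activity at x is inhibited by delayed activity at x + step, which is
    where a stimulus moving in the opposite (null) direction was one
    frame earlier.
    """
    width = len(channel[0])
    out = []
    for t in range(1, len(channel)):
        row = []
        for x in range(width):
            nb = x + step
            inhibition = channel[t - 1][nb] if 0 <= nb < width else 0.0
            row.append(rectify(channel[t][x] - inhibition))
        out.append(row)
    return out

# A bright dot moving rightward by one position per frame.
width = 8
frames = [[1.0 if x == t else 0.0 for x in range(width)] for t in range(6)]

on, off = on_off(frames)
# Elementary motion: sum of the rectified on and off components.
right = [[a + b for a, b in zip(ra, rb)]
         for ra, rb in zip(directional(on, +1), directional(off, +1))]
left = [[a + b for a, b in zip(ra, rb)]
        for ra, rb in zip(directional(on, -1), directional(off, -1))]

right_total = sum(map(sum, right))
left_total = sum(map(sum, left))
```

For the rightward-moving dot, only the rightward detectors remain active; the leftward detectors are vetoed by the delayed activity on their null side.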

**Figure 8**

*stationary motion* needed to be introduced; i.e., in addition to the different directions of motion a no-motion “direction” was used. This concept has been eliminated entirely from the version of the model presented here. Nothing is lost with this simplification; indeed, the equations are both simpler *and* more general as a result.

*g*_{A} and *g*_{B}). If the edge is moving, then the change will occur at the same location (*g*_{C} and *g*_{D}). There are four different conditions, corresponding to deletion at a static edge (A), accretion at a static edge (B), deletion by a moving edge (C), and accretion by a moving edge (D). In all cases, the inhibitory part of the shunting equations is a convolution with a “short but wide” kernel with length *σ*_{4} and width *σ*_{5} in order to normalize sensitivity based on lateral neighboring locations. These accretion and deletion signals pair up to generate the push–pull signals. First, the two signals that produce a push *ahead* of the signal are combined into one, then the two signals that produce a push behind the signal are combined.

**Figure 9**

*k*() and *l*() motion onset and offset signals of Barnes and Mingolla's (2013) equation 40 are replaced by the motion edge signal *m*() defined in our Equation 30. Bipole cells are used to connect edge fragments into longer-range boundaries, which are then used by both the motion and depth perceptual systems.

*X*), inhibitory (*I*), and gating (*Z*) terms.

*a*() (Equation 16), *d*() (Equation 19), *ψ*(), *ξ*() (Equation 32). Table 1 lists the relevant parameters used in the simulation (the parameters for the FORM system are not included). Most are the same as those used by Barnes and Mingolla (2013). While we have not done a formal dimensional analysis, we mention in the table when the ratios of some parameters seem to matter.

**Table 1**

- 1) Eliminating the smoothing of motion discontinuities (i.e., Barnes & Mingolla, 2013, equations 20–23). The elimination caused noisier signals, but they were no longer completely smoothed away, which is important because the motion signals from grayscale imagery are weaker.
- 2) Modifying all local processing so that instead of computing the maximum or minimum of the whole simulated field, it applies to a local area only. This change was essential to be able to handle the effects of a flickering field of dots, which generates spurious accretion and deletion events simultaneously.
- 3) Modifying the filtering of elementary motion signals so that the surround inhibition filters are elongated in the direction of motion. This was necessary in order to properly detect the smaller and finer features of the moving circle.
- 4) Localizing depth-order processing by creating push–pull signals that get filled in where appropriate.
- 5) Removing the need for “static” motion signals. This created better localization of moving edges and also simplified the computations needed.
- 6) Including a short-but-wide off-surround for the motion discontinuity detectors.
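The effect of the second change in the list above (local instead of global maxima) can be illustrated with a one-dimensional sketch. Dividing by the maximum is only one assumed way in which a maximum might be used; the window radius and the function names are hypothetical.

```python
def normalize_global(signal):
    """Scale every value by the maximum of the whole field."""
    peak = max(signal)
    return [v / peak if peak > 0 else 0.0 for v in signal]

def normalize_local(signal, radius):
    """Scale every value by the maximum within a local window."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        peak = max(window)
        out.append(signal[i] / peak if peak > 0 else 0.0)
    return out

# A weak event on the left and a strong, unrelated event on the right
# (for example, one produced by flickering dots).
signal = [0.1, 0.2, 0.1, 0.0, 0.0, 0.0, 0.0, 5.0, 10.0, 5.0]

global_scaled = normalize_global(signal)
local_scaled = normalize_local(signal, radius=2)
```

With the global maximum, the weak event is scaled down almost to zero by the distant strong event; with the local maximum, both events keep their full relative strength.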

**Figure 10**

**Figure 11**

**Figure 12**

**Figure 13**

**Figure 14**

**Figure 15**

**Figure 16**

*with* the background. It is not possible to determine with certainty whether the background moves or not in the standard version; but when the texture is present, there is certainty. However, because there is no texture, the FMO model cannot explain the standard version. In fact, any nontextured motion display is problematic. Possible ways to deal with this issue would probably need to include recognizing the whole circle as an object moving coherently, the general conservation of object area, and possibly tracking objects even when occluded (Marshall, Alley, & Hubbard, 1996).

*motion signals*, not the accretion or deletion of *texture elements*. In the case of realistic imagery, this would seem to be a rather pedantic distinction of no real practical concern, but it does provide a possible way to experimentally test this aspect of the model. It seems very unusual, probably impossible ecologically, to find a situation where surfaces generate the same motion signals while otherwise being indistinguishable. However, if it is possible to create such a situation, then it might be informative about what types of representations are used by the visual system to represent motion and to detect differences in motion.

*not the same sheet*. In other words, the statistics of the random dots and the motion are both identical. We call this the “Great Roe stimulus,” a reference to the Borgesian chimeric creature described by Allen: “The Great Roe. A mythological beast with the head of a lion and the body of a lion, though not the same lion” (1974, p. 20). While it is not clear whether the mythical beast has a clearly visible boundary between the head and the body, observers still see in our stimulus a clear boundary between the moving sheets of random dots. However, the boundary does feel “diminished” somehow compared to the classical accretion and deletion stimuli with at least one static surface. Our initial conclusions are that the motion signals are indeed important, but that they are not the only cue to seeing the boundary between the moving sheets. We are continuing to investigate this phenomenon.

*The Journal of Physiology*, 178 (3), 477–504.

*Neural Networks*, 37, 141–164.

*Proceedings of the third international conference on computer vision* (ICCV-90), Osaka, Japan (pp. 33–37). IEEE Computer Society Press.

*Figure-ground organization and perceptual grouping*. Retrieved from http://socrates.berkeley.edu/~plab/earlygroup/figureGroundGrouping.htm

*Proceedings of the fifth international conference on computer vision* (pp. 1050–1057). IEEE.

*IEEE Transactions on Pattern Analysis and Machine Intelligence*, 30 (7), 1171–1185.

*IEEE computer society conference on computer vision and pattern recognition* (pp. 274–281). IEEE.

*Perception & Psychophysics*, 6 (4), 193–198.

*Neural Networks*, 18 (10), 1319–1331.

*PLoS ONE*, 6 (6), e20951.

*Journal of Neurophysiology*, 71 (5), 1597–1626.

*Advances in neural information processing systems*, Vol. 8 (pp. 816–822). Cambridge, MA: MIT Press.

*Michotte's experimental phenomenology of perception* (pp. 140–167). Hillsdale, NJ: Erlbaum. (Reprinted from *Les compléments amodaux des structures perceptives*, by A. Michotte, G. Thinès, & G. Crabbé, 1964, Louvain, Belgium: Publications Universitaires de Louvain)

*Proceedings of the fifth international conference on computer vision* (pp. 1004–1049). IEEE.

*Perception*, 17 (2), 255–266.

*Behavioral and Brain Sciences*, 21, 723–748.

*Neural Computation*, 23 (11), 2868–2914.

*Perception*, 17, 289–296.

*2011 IEEE conference on computer vision and pattern recognition* (pp. 2233–2240). IEEE.

*Journal of Vision*, 12 (9): 242, doi:10.1167/12.9.242. [Abstract]

*Computer Vision and Image Understanding*, 100 (1), 3–40.

*Proceedings of the National Academy of Sciences, USA*, 111, E5214–E5223.

- Movie 11. The Great Roe stimulus.