Research Article  |   June 2008
Superposition catastrophe and form–motion binding
Journal of Vision June 2008, Vol.8, 13. doi:https://doi.org/10.1167/8.8.13

      Jean Lorenceau, Christophe Lalanne; Superposition catastrophe and form–motion binding. Journal of Vision 2008;8(8):13. https://doi.org/10.1167/8.8.13.

Abstract

Numerous studies indicate that perceiving global object motion results from the integration of local component motions across space and time. Less attention has been paid to the issue of motion selection, necessary to avoid spurious associations of component motions belonging to different objects and to solve the so-called “superposition catastrophe problem” (F. Rosenblatt, 1961). We address this issue using outlines of geometrical shapes moving behind apertures that concealed their vertices, such that recovering their global motion requires the selection and integration of some, but not all, component motions. Depending on which local motions are selected for motion integration, these stimuli yield the perception of expansion/contraction, of global translation, or of segments moving independently. We show that the selection process depends on local and global stimulus parameters, including the local direction of the figure's line-endings and the spatial configuration of component motions. In contrast, motion selection depends less on the width (i.e., spatial frequency content) or polarity of the edges. Finally, synchronous temporal modulation of component motions in the gamma range has little effect on motion selection. These results indicate that selecting component motions for motion integration is primarily determined by form constraints. As a consequence, current models assuming that mutually consistent component motions are bound in a velocity space lacking spatial organization should be revised to account for the present data. Alternatively, interactions between visual areas selectively processing form and motion could be introduced in order to account for the perceptual binding of moving objects.

Introduction
Psychophysical, physiological, and modeling studies converge to indicate that object motion is recovered by integrating motion signals analyzed across space and time by direction-selective sensors with spatially limited receptive fields (Adelson & Movshon, 1982; Fennema & Thompson, 1979; Grossberg, Mingolla, & Viswanathan, 2001; Movshon, Adelson, Gizzi, & Newsome, 1986; Simoncelli & Heeger, 1998). Although different stimuli, such as plaids, aperture stimuli, or random dot kinematograms, have been used to assess the characteristics of the motion integration process (Lorenceau & Shiffrar, 1992; Mingolla, Todd, & Norman, 1992; Rubin & Hochstein, 1993; Stoner & Albright, 1992; Watamaniuk & Sekuler, 1992; Williams & Sekuler, 1984; Wilson & Kim, 1994), these studies have mostly focused on the conditions that yield the perception of global coherent motion or, on the contrary, the perception of local (or transparent) component motion, namely the conditions under which motion integration or segmentation occurs (Mingolla et al., 1992; Rubin & Hochstein, 1993; for reviews, see Lorenceau & Shiffrar, 1999; Stoner & Albright, 1994). A consensus view is that integration and segmentation are two sides of the same coin: component motions are either merged into an integrated percept or segregated and treated as independent motions. In these studies, the motion of 2D features proved of primary importance for motion integration, as their salience, reliability, and status gate the very possibility of combining component motions distributed across space into a whole. Consequently, it was proposed that motion integration proceeds in two stages: 1D local motion would first be extracted, irrelevant features, such as T-junctions due to occlusion, would be discarded, and mutually consistent component motions would finally be integrated into a single moving object at a second stage.
Computational models based on this two-stage scheme have successfully accounted for much of the existing data (Grossberg et al., 2001; Koechlin, Anton, & Burnod, 1999; Lidén & Pack, 1999; Nowlan & Sejnowski, 1994; Simoncelli & Heeger, 1998), although accounting for the influence of form and spatial context (Lorenceau & Alais, 2001; McDermott, Weiss, & Adelson, 2001) remains a difficult challenge for these models (but see Weiss & Adelson, 1996).
However, in everyday life, we are often confronted with visual scenes containing several objects moving independently in different directions. Recovering their motions requires both the segregation of local motion measurements related to each object from those belonging to different objects and the integration of these separate collections of component motions for recovering each object motion. The problem of selection, that is, the process which “decides” which local motions must be bound together, and when, has not been thoroughly studied, perhaps because the stimuli at hand were not well suited to address this issue. Indeed, in most studies with aperture stimuli or plaids, the perception of component motions is contrasted with that of a single coherent “object” motion. With rare exceptions (Lorenceau & Boucart, 1995; Mingolla et al., 1992), only two component motions and a single moving “object” are used in these studies. This may not bring insights into how the selection process operates, as these studies do not contrast the perception of several “objects” each requiring the partial integration of component motions. Therefore, studies using these classes of stimuli tackle the issue of motion integration versus motion segmentation rather than address the issue of motion selection.
Consider instead the stimulus in Figure 1, which represents two overlapping outlines of geometrical shapes moving in different directions. If these stimuli are seen behind apertures such that only straight segments are visible, the activity elicited in orientation- and direction-selective cells in area V1 (this would also be true for any sensor processing motion locally) might resemble the schema presented at the bottom of Figure 1, in which a population of V1 neurons faces the aperture problem, i.e., each neuron responds to the component motion orthogonal to its preferred orientation. The question, then, is not only to determine whether the responses of these cells (or sensors) should be bound together or not, but also to determine which responses should be bound while avoiding spurious combinations of contours belonging to different objects. Paradoxically, although this difficulty, referred to as the “superposition catastrophe” problem (Nowlan & Sejnowski, 1995; Rosenblatt, 1961; von der Malsburg, 1981, 1999; von der Malsburg & Schneider, 1986; Weiss & Adelson, 1996), has received much attention from a modeling perspective, little experimental work has been conducted to explore how biological visual systems solve this problem.
Figure 1
 
(Top) The problem of motion selection illustrated with two geometrical shapes moving in different directions behind apertures. (Bottom) Schematic representation of the activity evoked in primary visual cortex by component motions. Recovering global object motion requires that the visual system combines component motions from the same shape and avoids spurious combinations with component motions from a different shape, i.e., solve the superposition catastrophe problem.
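The aperture problem described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a model used in this study: the helper name `component_velocity`, the unit speeds, and the two example object velocities are hypothetical. The sketch shows that a sensor viewing a straight contour recovers only the velocity component orthogonal to that contour, so that two different object motions can produce the same local measurement.

```python
import math

def component_velocity(object_velocity, contour_orientation_deg):
    """Return the velocity component orthogonal to a straight contour.

    Hypothetical helper: a local sensor viewing a straight contour can only
    measure motion along the contour's normal (the aperture problem).
    """
    theta = math.radians(contour_orientation_deg)
    normal = (-math.sin(theta), math.cos(theta))  # unit vector orthogonal to the contour
    speed = object_velocity[0] * normal[0] + object_velocity[1] * normal[1]
    return (speed * normal[0], speed * normal[1])

# Two different object velocities (arbitrary units, chosen for illustration).
rightward = (1.0, 0.0)
oblique = (0.5, -0.5)

# Measured on a contour oriented at 45 degrees, both yield the same component motion.
print(component_velocity(rightward, 45.0))
print(component_velocity(oblique, 45.0))
```

Because the two measurements coincide, a single local response cannot tell which object, or which global motion, it belongs to.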
Models designed to recover objects' form from the local analysis of their contours often fail because of the lack of information regarding their ownership and relationships. Solving this problem calls for additional constraints (see e.g., Qiu, Sugihara, & von der Heydt, 2007). For motion processing, using a rigidity constraint and determining which component motions are mutually consistent may help in parsing the image into distinct perceptual entities. However, several solutions might exist. As a matter of fact, observers asked how many objects are shown in Figure 1 and in which directions they move have difficulty answering when all segments have the same color. The percept is that of a single non-rigid flow grossly moving in the average component direction. This suggests that observers have trouble selecting component motions belonging to the same rigid object on the sole basis of the mutual consistency of their component direction and speed, and segmenting these motions from the remaining inconsistent moving contours. Tagging the component motions of each object (e.g., with green and red colors) helps to resolve the superposition catastrophe problem, with the result that both objects are correctly segmented and their global motion recovered. This suggests that other constraints (or prior assumptions) can be used to solve this binding problem. Among them, the constraints related to form information may play an important role, as it has recently been shown that form is indeed a powerful attribute that influences the ability to bind local motions into a whole moving object (Lorenceau & Alais, 2001; Lorenceau & Zago, 1999; McDermott et al., 2001).
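The point that mutual consistency alone admits several solutions can be made concrete with a short Python sketch. It is a hedged illustration, not the analysis used in this study: the function names, the two object velocities, and the contour orientations are assumptions chosen for simplicity. Each contour contributes one constraint (its normal and its normal speed), and a least-squares fit returns the rigid translation that best satisfies a chosen set of constraints.

```python
import math

def constraint(object_velocity, contour_orientation_deg):
    """Return (unit normal, normal speed) for one straight contour (hypothetical helper)."""
    theta = math.radians(contour_orientation_deg)
    n = (-math.sin(theta), math.cos(theta))
    return n, object_velocity[0] * n[0] + object_velocity[1] * n[1]

def fit_rigid_translation(constraints):
    """Least-squares velocity V with n . V = s for every (n, s), plus the RMS residual."""
    a = sum(n[0] * n[0] for n, _ in constraints)
    b = sum(n[0] * n[1] for n, _ in constraints)
    d = sum(n[1] * n[1] for n, _ in constraints)
    p = sum(n[0] * s for n, s in constraints)
    q = sum(n[1] * s for n, s in constraints)
    det = a * d - b * b  # zero if all contours are parallel (not handled in this sketch)
    vx = (d * p - b * q) / det
    vy = (a * q - b * p) / det
    squared = [(n[0] * vx + n[1] * vy - s) ** 2 for n, s in constraints]
    return (vx, vy), math.sqrt(sum(squared) / len(squared))

# Two hypothetical objects translating in opposite horizontal directions (arbitrary units).
object_a, object_b = (1.0, 0.0), (-1.0, 0.0)

# Correct grouping: two contours of the same object.
same_object = [constraint(object_a, 45.0), constraint(object_a, 135.0)]
# Spurious grouping: one contour from each object.
mixed = [constraint(object_a, 45.0), constraint(object_b, 135.0)]

print(fit_rigid_translation(same_object))
print(fit_rigid_translation(mixed))
```

In this toy case both groupings are fitted exactly by a rigid translation (horizontal for the correct grouping, vertical for the spurious one), which is why consistency of direction and speed cannot by itself decide which component motions belong together.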
In this study, we designed a new stimulus (two moving diamonds partially visible through apertures that concealed their vertices at all times) to uncover the constraints involved in motion selection. One reason to use this two-diamond stimulus is that it is inherently ambiguous, such that the different percepts it may elicit can bring insights into the motion selection process at work. When static, this stimulus can be seen either as a small diamond embedded in a large one or as two overlapping diamonds of the same size (Figure 2). When asked to describe this static stimulus, observers spontaneously favor the first interpretation, corresponding to two diamonds of different sizes. This is expected because proximity, good continuation, and closure, known to be powerful cues for grouping static contours, favor this interpretation as stated earlier by the Gestaltists (Koffka, 1935; see also Kovacs & Julesz, 1993).
Figure 2
 
Stimuli used in the experiment. A static view of two diamonds seen behind apertures (top) can be interpreted in different ways (bottom): overlapping diamonds of same size (a, c) or a small diamond surrounded by a large one (b). Observers generally favor this latter interpretation, as expected if good continuation, proximity, and closure are used to group the line segments into a whole.
Figure 3 shows the local motion vectors available within two crossed rectangular apertures when two diamonds with different or equal size oscillate out of phase in opposite directions. Depending on which local component motions are selected, several percepts can emerge. For instance, when the display corresponds to two diamonds of equal size (Figure 3, bottom), binding together the 4 component motions in the center on the one hand and the 4 outer component motions on the other should elicit the percept of a small expanding diamond and a large contracting diamond; alternatively, grouping the component motions that belong to each diamond of the same size (e.g., two inner and two outer segments) would yield the percept of two identical diamonds translating in opposite directions. Yet, these two interpretations are equally plausible: common fate, which may be used as a rule to group component motions, holds for both interpretations. The same combination scheme applied to diamonds of different size (Figure 3, top) would yield opposite perceptual outcomes.
Figure 3
 
When two diamonds of different (top) or equal (bottom) size seen behind crossed apertures oscillate out of phase in opposite directions (large arrows), the local component motions (thin arrows) can be combined in different ways, resulting either in the perception of two translating diamonds or of two expanding/contracting diamonds.
Note that models of motion integration based on averaging local component motions predict a stationary percept as neighboring segments moving in opposite directions with the same speed would cancel each other. Alternately, all component motions could be segregated, such that each contour would appear to move independently with no global coherent percept, in which case no rigid moving diamonds would be seen at all. We performed a series of experiments with this class of stimuli to determine which of these different perceptual interpretations dominates and to examine the parameters on which they might depend. Informal observations with these two displays indicated that observers favor the perception of expanding/contracting diamonds of unequal size or that of two translating diamonds of the same size rather than see incoherent motion, although these percepts were somewhat multistable with episodes of perceived incoherent motion. Among these different interpretations of these ambiguous stimuli, two—expansion/contraction versus translations—clearly dominate. We therefore used a 2-alternative-forced-choice (2AFC) paradigm to uncover the parameters that favor the selection of one or the other percept. 
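The cancellation predicted by averaging models can be checked with a small Python sketch. This is an illustration under assumptions, not a reimplementation of any published model: the helper names, the unit speeds, and the use of one component per diamond side are hypothetical simplifications.

```python
import math

def component_velocity(object_velocity, contour_orientation_deg):
    """Velocity component orthogonal to a straight contour (hypothetical helper)."""
    theta = math.radians(contour_orientation_deg)
    normal = (-math.sin(theta), math.cos(theta))
    speed = object_velocity[0] * normal[0] + object_velocity[1] * normal[1]
    return (speed * normal[0], speed * normal[1])

def vector_average(vectors):
    """Plain vector average of a list of 2D vectors."""
    count = len(vectors)
    return (sum(v[0] for v in vectors) / count, sum(v[1] for v in vectors) / count)

# Orientations of the four sides of one diamond (45 and 135 degrees, as in the Methods).
SIDE_ORIENTATIONS = (45.0, 135.0, 45.0, 135.0)
# Two diamonds moving horizontally in opposite directions (arbitrary unit speed).
diamond_velocities = [(1.0, 0.0), (-1.0, 0.0)]

components = [component_velocity(v, o)
              for v in diamond_velocities for o in SIDE_ORIENTATIONS]

pooled = vector_average(components)                       # all eight components together
per_diamond = [vector_average(components[i:i + 4]) for i in (0, 4)]

print("pooled average:", pooled)
print("per-diamond averages:", per_diamond)
```

Pooling all eight components gives a null vector (up to floating-point error), whereas averaging within each diamond gives two opposite horizontal directions, so the outcome of an averaging scheme depends entirely on which components are selected beforehand.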
Experiment 1: Effect of aperture orientation
In this first experiment, we use two overlapping diamonds of equal size and measure the proportion of perceived expansion/contraction versus left/right translation as a function of the orientation of the apertures relative to the component line-segments. With this stimulus, the motion of line-ends at aperture borders is congruent with the local component motions when the orientations of the apertures are 45° and 135° but incongruent for different aperture orientations. 
It is thus possible to test whether motion selection is constrained by the perceptual organization of the stimulus when static—a small diamond embedded in a large one—or whether the motion of line-endings at aperture borders, previously found to strongly modulate motion integration (Lorenceau & Shiffrar, 1992), overcomes the global form constraint. To compare the dynamics of the contribution of line-ends to motion binding with previous results (Lorenceau, 1996; Lorenceau, Shiffrar, Wells, & Castet, 1993; Shiffrar & Lorenceau, 1996; Yo & Wilson, 1992), two durations of motion were used (300 and 600 ms) and apertures could be visible or not (see below), such that line-ends could be classified as extrinsic—visible apertures—or intrinsic—invisible apertures (Shimojo, Silverman, & Nakayama, 1989). 
Methods and stimuli
All stimuli were displayed on a 20-in. Sony monitor (GDM 1950, 1280 × 1024 × 8 bits) refreshed at 60 Hz. The stimuli were simple white (109 cd/m²) line drawings of 2 overlapping diamonds (side 4 degrees of visual angle, dva hereafter) presented against a gray background (12.08 cd/m²) behind two black or gray rectangular windows, or apertures (width 1 dva, length 4 dva), such that the diamonds' vertices were hidden (Figure 3). The center-to-center distance between the diamonds was 0.9 dva. The apertures were invisible when gray and of the same hue and luminance as the background. This, in turn, changed the status of the line-ends at aperture borders from extrinsic to intrinsic (Shimojo et al., 1989). Five aperture orientations were used (0°, 20°, 45°, 70°, and 90°). The orientations (45° and 135°) and width (1/60 dva) of the diamonds' edges were kept the same throughout the experiment. Therefore, depending on aperture orientation, visible segment length varies from 1 dva (oblique apertures) to 1.41 dva (vertical and horizontal apertures), as shown in Figure 4.
Figure 4
 
Top: Proportion of perceived expansion, averaged across six observers, plotted as a function of aperture orientation for visible and invisible apertures at two durations of motion. Bottom: Standard deviations across observers as a function of aperture orientation. See text for details.
A 2AFC procedure was used to measure the dominance of perceived expansion/contraction versus left/right translation. The two diamonds translated sinusoidally out of phase along a horizontal path (amplitude = 0.56 dva, frequency = 0.83 Hz) around a fixation point placed in the center of the screen. Motion parameters were chosen such that the vertices of each diamond always remain invisible. At the end of each trial, observers were asked to press one of two keys to indicate whether the diamonds appeared to translate in opposite directions or to contract and expand. Note that to perform the task, it is necessary to selectively integrate segment motions across space and time. Each stimulus was presented 20 times within a block. Two durations of motion (300 and 600 ms) were tested in different blocks. Five students in the department of Psychology participated in the experiment. They were seated 114 cm away from the screen with their head maintained in a chin rest. All had normal or corrected-to-normal vision. 
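The timing and geometry of the display can be worked out from the parameters given above, as in the following Python sketch. It is a rough reconstruction under assumptions, not the original stimulus code: the amplitude is taken as the peak excursion, the starting phase is set to zero, and the function names are hypothetical.

```python
import math

REFRESH_HZ = 60.0            # monitor refresh rate (from the Methods)
AMPLITUDE_DVA = 0.56         # translation amplitude; assumed here to be the peak excursion
FREQUENCY_HZ = 0.83          # oscillation frequency (from the Methods)
VIEWING_DISTANCE_CM = 114.0  # viewing distance (from the Methods)

def frames_for(duration_ms):
    """Number of video frames needed to display a motion of the given duration."""
    return round(duration_ms / 1000.0 * REFRESH_HZ)

def horizontal_offsets(duration_ms, start_phase=0.0):
    """Per-frame horizontal offsets (dva) of the two diamonds, moving out of phase.

    The starting phase is an assumption; it is not specified in the text.
    """
    offsets = []
    for frame in range(frames_for(duration_ms)):
        t = frame / REFRESH_HZ
        x = AMPLITUDE_DVA * math.sin(2.0 * math.pi * FREQUENCY_HZ * t + start_phase)
        offsets.append((x, -x))  # second diamond mirrors the first
    return offsets

def cm_per_dva(distance_cm):
    """Size on the screen (cm) subtending 1 degree of visual angle."""
    return 2.0 * distance_cm * math.tan(math.radians(0.5))

for duration_ms in (300, 600):
    cycles = duration_ms / 1000.0 * FREQUENCY_HZ
    print(duration_ms, "ms:", frames_for(duration_ms), "frames,", round(cycles, 3), "cycles")
print("1 dva on screen:", round(cm_per_dva(VIEWING_DISTANCE_CM), 2), "cm")
```

Under these assumptions, the 300-ms and 600-ms presentations cover about a quarter and a half of one oscillation cycle, respectively, and 1 dva corresponds to roughly 2 cm on the screen.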
The movie below shows examples of the stimuli used in the experiment (Movie 1).
 
Movie 1
 
Display showing the different percepts evoked by different aperture orientations. In this movie, three orientations of the visible apertures are shown. In the experiment, a set of five possible aperture orientations was selected and each orientation was used in different trials.
Results
The percentage of perceived expansion/contraction, averaged across observers, is plotted as a function of aperture orientation for two durations in Figure 4, top. The associated standard deviations are plotted in Figure 4, bottom. Perceived expansion dominates largely when aperture orientation is 45°, i.e., when segment and line-ends motion are both congruent with the perception of expanding and contracting diamonds. 
For horizontal and vertical apertures, the proportion of perceived expansion/contraction reaches a minimum. As the orientation of the apertures progressively differs from oblique, i.e., as the difference between line-end and segment direction increases, the percentage of perceived expansion decreases. This effect is larger for invisible than for visible apertures at a short duration (300 ms) and is large and similar for both conditions at a long duration (600 ms). For this latter duration, perceived expansion is rare when apertures are horizontal and vertical, with a small advantage for horizontal apertures.
Examining the standard deviations across observers (Figure 4, bottom) brings additional insights into the process at work: standard deviations are large for horizontal and vertical apertures at a short duration (300 ms) but decrease at a longer duration, indicating that the perception of two translating diamonds is more reliable across observers. For intermediate aperture orientations, standard deviations are large at all durations, indicating large inter-individual differences in the perception of these ambiguous stimuli.
Because the variances of the responses do not meet the usual assumptions of a classical ANOVA, we performed a statistical analysis using a mixed model. A model including the 3 fixed effects and the interactions between time and aperture visibility, together with angle, provides a good fit to the data.
Altogether, these results indicate that perceiving expansion/contraction is largely dependent upon aperture orientation, i.e., dependent upon the direction of moving line-ends, rather than constrained exclusively by global form attributes. Note that the “non-ecological” perception of intermingled or superimposed diamonds of equal size does not seem to prevent the perception of left/right translation with vertical or horizontal apertures. We mentioned in the introduction that episodes of incoherent motion could also be seen with these stimuli. Although we did not attempt to characterize the frequency and the duration of these episodes, our finding that observers' reports are close to 100% or 0% for some conditions (oblique, horizontal, and vertical apertures) indicates that these episodes were rare and occurred mostly during long-lasting presentations. The effect of the short durations used here suggests that the contribution of line-ends to motion parsing is a dynamic process that develops slowly over time. This is compatible with previous observations with translating lines or a rotating diamond (Lorenceau et al., 1993; Shiffrar & Lorenceau, 1996) and further supports the view that line-ends are processed more slowly than segment motion (see also Lorenceau, Giersch, & Series, 2005; Pack & Born, 2001; Yu & Levi, 1999). The effect of aperture visibility, which modifies the salience of line-ends both by increasing their contrast and by changing their status from extrinsic to intrinsic (Shimojo et al., 1989), also supports the proposition that line-ends' motion modulates motion selection. This latter finding appears at odds with previous proposals that extrinsic line-ends, forming T-junctions at aperture borders, are irrelevant for motion integration and are thus discarded before integration of mutually consistent component motions is performed (Lidén & Pack, 1999; Shimojo et al., 1989).
This assumption may be too strong, however, as the present results suggest that extrinsic moving line-ends do contribute to motion selection (also see Duncan, Albright, & Stoner, 1994). 
Experiment 2: Selection by size
It is well known that the visual system analyses images at different spatial scales through different spatial frequency channels (Blakemore & Campbell, 1969; Campbell & Robson, 1968; Watson, 1982; Wilson & Bergen, 1979). Similarly, contrast polarity is processed through different (ON and OFF) channels. To evaluate the extent to which motion selection relies on spatial scale and contrast polarity, we performed the same experiment as before, using two diamonds of equal size, but with different segment widths and polarity (Figure 5). Although bars with sharp edges contain energy in different spatial frequency bands, we assumed that different bar widths would shift the energy spectrum of these stimuli toward higher or lower spatial frequencies, such that the similarities or dissimilarities in the averaged spatial frequency content would nonetheless allow us to test whether motion selection operates at different spatial scales. Several combinations of segment widths and contrast polarity were chosen such that binding segments within a similar spatial scale, i.e., grouping segments of identical width, would counteract the preference for perceiving expanding/contracting diamonds found in Experiment 1.
Figure 5
 
Stimuli used in Experiment 2 with line segments of varying widths and polarity. In the experiment, the diamonds translated sinusoidally in opposite phase. Observers were asked whether they perceived two translating diamonds (of the same size) or two expanding/contracting diamonds (of different sizes).
Again, the diamonds oscillated back and forth in opposite directions, and observers were required to indicate whether the stimulus appears as two expanding/contracting diamonds or as two diamonds translating horizontally in opposite directions. 
Stimuli
A set of 4 combinations of 2 segment widths (0.03 and 0.11 dva) and contrast polarity was built (Figure 5) such that grouping segments according to their width and polarity would yield the perception of two diamonds of equal overall size but different segment widths. The setup and the motion characteristics were as before. Only oblique apertures (45° and 135°) were used together with a short duration of motion (300 ms). The percentage of perceived expansion/contraction was measured for 6 observers (20 trials per condition) in a two-alternative-forced-choice design. Five of the six observers had taken part in Experiment 1.
Results
Figure 6 shows the percentage of perceived expansion/contraction plotted for the different combinations of segment widths and polarity. 
Figure 6
 
Proportion of perceived expansion, averaged across six observers, plotted for the stimuli shown in Figure 5. Error bars represent one standard error.
As can be seen, the percentage of perceived expansion/contraction is lower than in the comparable condition of Experiment 1, suggesting that grouping by polarity biases observers toward seeing more translation. Overall, segment width per se has little influence, as the percentage of perceived expansion/contraction remains high in all conditions. However, this percentage is significantly lower when dissimilar segment widths are used than in conditions using a single width (F(1,5) = 3.62, p < 0.05).
Comparisons between Experiments 1 and 2 suggest that grouping by similarity and polarity does not overcome other—form—constraints used to select component motions. 
Thus, although motion selection is modulated by similarities in spatial frequencies and polarity in a subset of conditions, it is not strongly constrained by the properties of spatiotemporal frequency filters that analyze component motion at an early stage, at least for the class of stimuli used herein. 
Experiment 3: Small and large diamonds
In the previous experiments, two diamonds of equal size moving sinusoidally in opposite directions behind crossed apertures were mostly seen as a small and a large diamond expanding and contracting. One possibility is that this effect simply reflects the “direct” compatibility of the distribution of component motions with an expanding and contracting flow (Gibson, 1979), whereas the perception of translating diamonds is computationally more demanding, as it requires that component motions incompatible with the global direction (e.g., the motion of line-endings along oblique aperture borders) are “discarded,” while the ambiguous motion signals extracted by motion-selective cells within a moving segment, compatible with a global translation, are combined into a global horizontal motion. To test whether the preference for expansion/contraction found in the previous experiments is due to this difference, two diamonds of unequal sizes translating along a horizontal axis are used in Experiment 3. The diamonds' sizes were chosen such that the stimulus, when static, was identical to that used in the previous experiments. If global form constrains motion selection, that is, if the segments are grouped on the sole basis of proximity, good continuation, and closure, the perception of a small and a large diamond translating sinusoidally in opposite phase along a horizontal axis should dominate over the percept of two expanding/contracting diamonds of equal size. If, on the contrary, the perception of contraction and expansion is independent of the spatial distribution of the local motion vectors in the visual field and emerges because component motions are “directly” compatible with expansion and contraction, or is simply preferred as ecologically more meaningful, two diamonds of equal size should be seen, each oscillating in depth.
Stimuli
The stimuli (Figure 7) consisted of 12 combinations of 2 segment widths (0.03 and 0.11 dva) such that grouping segments according to their width would yield either the perception of two identical diamonds or two diamonds of different size. The distribution of segment width was counterbalanced to avoid perceptual biases that could favor a particular configuration. The motion characteristics were as before, except that grouping the motion of the inner segments on the one hand and the outer segments on the other would yield a percept of small and large translating diamonds. Only oblique visible or invisible apertures (45° and 135°) were used together with a short duration of motion (300 ms). The percentage of perceived expansion/contraction was measured for 6 observers (30 trials per condition).
Figure 7
 
Stimuli used in Experiment 3. Diamonds of equal or different sizes were made of segments of similar or different widths. In each trial, the 2 diamonds translated sinusoidally out of phase along a horizontal axis.
Results
The percentage of perceived expansion/contraction, averaged across observers, is plotted in Figure 8. A clear dichotomy between conditions is found, with diamonds of same size being mostly interpreted as expanding/contracting while diamonds of different size are viewed more often as translating in opposite directions. 
Figure 8
 
Proportion of perceived translation, averaged across six observers, plotted for the stimuli shown in Figure 7. Error bars represent ±1 standard error.
For these latter conditions, the distribution of widths across space significantly influenced perceived motion (F(5,25) = 2.99; p < 0.05): when segments of the same width define two diamonds of equal sizes, observers more often report a perception of expansion/contraction. In addition, segment width has a significant effect on response times (F(5,25) = 3.62; p < 0.05; data not shown). In particular, response times for the two diamond size conditions are significantly longer when line segments are grouped by similarity (F(2,10) = 60.99; p < 0.05).
These findings indicate that selecting motion components strongly depends on proximity, continuity, and closure, which are not counterbalanced by a similarity constraint. 
Experiment 4: Selection by synchrony
Among the features that may be used for perceptual binding, temporal information has recently attracted the interest of several researchers. One reason for this renewed interest comes from electrophysiological recordings in cat and monkey bringing evidence that synchronization of neuronal activity may play an important role in feature binding (Eckhorn, 2000; Eckorn et al., 1988; Gray & Singer, 1989; Singer & Gray, 1995). Synchronized activity between pairs of neurons has been observed in the gamma range (40 Hz) and was found to depend on stimulus configuration, such that synchronization of spike discharge was strongest for stimuli that met one or several of the Gestalt criteria for perceptual grouping (e.g., common fate, colinearity; for a review, see Singer & Gray, 1995; but see Palanca & DeAngelis, 2005; Thiele & Stoner, 2003). 
Based on these electrophysiological data and theoretical considerations (von der Malsburg, 1981), several psychophysical experiments have been conducted to test whether temporal modulation of visual stimuli in the gamma range would favor perceptual grouping (Alais, Blake, & Lee, 1998; Lee & Blake, 1999; Leonards, Singer, & Fahle, 1996). Although some authors found increased perceptual grouping under these conditions, others have disputed the logic of these experiments (Farid & Adelson, 2001) or found different results. 
We sought to study the effect of temporal modulation on motion selection using two different displays, consisting of two overlapping diamonds with different global motion. In these displays, segments consistent with diamond shapes were flickered in opposite phase at varying temporal frequencies, and observers were asked which of two conflicting percepts dominates. In a first experiment, stimuli similar to those used in Experiment 1 were used, such that grouping segments based on the imposed temporal modulation would yield a percept of two translating diamonds of the same size, in conflict with the spontaneous dominant perception of small and large expanding/contracting diamonds found in the previous experiments. Therefore, these stimuli allow us to test whether grouping based on temporal modulation may counteract grouping based on global form constraints (see Experiment 1). 
A second display was designed to avoid a strong contribution of form constraints that could already bias motion interpretation ( Figure 10). To that aim, the motion of the 4 inner—respectively 4 outer—segments was never compatible with a rigid motion of a small—respectively large—diamond; however, depending on which segments were selectively used for motion integration, two identical rigid diamonds could be perceived as translating along two orthogonal axes (vertical and horizontal) or seen as moving along different circular trajectories (e.g., clockwise motion with a 90° phase lag between the 2 diamonds). The potential role of temporal modulation in motion selection was assessed by flickering 4 different component segments in phase at varying temporal frequencies, while the other 4 segments flickered at the same temporal frequencies but out of phase. 
Methods and stimuli
These experiments were done with a higher refresh rate (100 Hz) than in previous experiments. Different temporal modulation—flicker—frequencies (0, 2.5, 5, 10, 20, and 40 Hz) were chosen and tested within a single block of 150 trials (25 trials per condition). Note that at low temporal frequencies, each diamond is presented alone and moves during a fraction of time—e.g., 200 ms at 5 Hz—before it is replaced by the second moving diamond in a cycle. Although the alternation between the two diamonds is clearly seen at the lowest temporal frequencies, it is not seen at temporal frequencies above 10 Hz (although a perception of flicker was possible). Nine observers with normal or corrected-to-normal vision participated in this study. 
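The timing implied by these parameters can be worked out with a few lines of arithmetic. The following is a minimal sketch, not the authors' stimulus code: the function name `cycle_timing` and the constants are illustrative, and because this section does not specify how a flicker cycle is split between the two diamonds, only whole-cycle values are computed.

```python
REFRESH_HZ = 100                      # refresh rate stated in the text
FLICKER_HZ = [0, 2.5, 5, 10, 20, 40]  # tested frequencies; 0 Hz means no flicker
TRIALS_PER_CONDITION = 25             # stated in the text


def cycle_timing(flicker_hz, refresh_hz=REFRESH_HZ):
    """Return (cycle period in ms, video frames per cycle), or None without flicker."""
    if flicker_hz == 0:
        return None
    return 1000.0 / flicker_hz, refresh_hz / flicker_hz


# Hypothetical summary table keyed by flicker frequency.
timing = {f: cycle_timing(f) for f in FLICKER_HZ}
trials_total = TRIALS_PER_CONDITION * len(FLICKER_HZ)

for f, value in timing.items():
    print(f, value)
print("trials per block:", trials_total)
```

Under these assumptions a 5 Hz cycle lasts 200 ms (20 frames), and six conditions of 25 trials give the 150-trial block. Note that 40 Hz corresponds to a non-integer 2.5 frames per cycle at 100 Hz; the sketch reports that value without assuming how it was rendered.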
Results
The data obtained with the first display, in which the grouping of inner segments can yield a rigid percept of a small diamond, are shown in Figure 9. In this figure, the percentage of perceived expansion/contraction is displayed as a function of temporal frequency. It is clear that at low temporal frequency (<10 Hz), the perceptual reports are strongly biased toward seeing translation; this is by no means surprising, as each translating diamond is presented alone for a fraction of a second, a duration sufficient to temporally segregate the two diamonds and extract their global motion direction. 
Figure 9
 
Results of Experiment 4 showing the proportion of seen translation averaged across observers as a function of flicker frequency. NA corresponds to a no-flicker condition where expansion/contraction is seen. Error bars represent ±1 standard error.
At higher temporal frequencies, the alternation rate between the two diamonds increases up to a point where it is no longer possible to perceive the two diamonds independently, as they are processed within a single temporal integration window. 
An ANOVA confirms the strong effect of temporal frequency (F(5,40) = 94.40; p < 0.001). If high flicker frequency in the gamma range were used to tag the segments and constrain their integration as perceptual units, one would expect that the perception of translating diamonds would dominate over the perception of expansion/contraction found in the previous experiment with no temporal modulation. This is not what is observed, indicating that temporally tagging the stimuli in the gamma range has little effect on perceptual grouping. However, with this display, temporal modulation would have to overcome the strong form constraint related to Gestalt principles found in Experiment 1 in order to yield a perception of two translating diamonds. It is therefore possible that the conflict between the temporal and form cues prevented any influence of temporal modulation on motion selection from being revealed. 
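The degrees of freedom of the ANOVA reported above can be checked against the design. The sketch below assumes a one-way repeated-measures (within-observer) analysis, which this section does not state explicitly; the helper name `rm_anova_df` is illustrative.

```python
def rm_anova_df(n_observers, n_levels):
    """Degrees of freedom (effect, error) for a one-way repeated-measures ANOVA.

    Assumes every observer is tested at every level of the factor.
    """
    if n_observers < 2 or n_levels < 2:
        raise ValueError("need at least two observers and two levels")
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_observers - 1)
    return df_effect, df_error


# Nine observers, six flicker frequencies (0, 2.5, 5, 10, 20, and 40 Hz).
df_flicker = rm_anova_df(n_observers=9, n_levels=6)
print(df_flicker)
```

With nine observers and six temporal frequencies this gives 5 and 40 degrees of freedom, in line with the reported F(5,40).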
The second set of results was obtained with a display which prevents the use of a form constraint ( Figure 10). Depending on which moving segments are selected and bound together, this stimulus can be seen as two diamonds translating sinusoidally along orthogonal axes, or as two diamonds translating along two circular trajectories with a 90° phase lag. The four inner—versus outer—segments are never compatible with a rigidly moving diamond, such that form constraints relying on proximity and closure are irrelevant in this case. 
Figure 10
 
Stimulus of Experiment 4: Depending on the segments selected for motion grouping 2 diamonds translating up–down and left–right (left) or 2 diamonds rotating along a circular trajectory with a 90° phase lag (right) can be seen. Red and green are used here for clarity. Black and white segments were used in a control condition. In test conditions, all segments were white and flickered out of phase at a high temporal frequency (40 or 27 Hz). See Movie 2 for a demo of the possible groupings based on green/red colour cues, and the ambiguity existing when no such cues are available.
10.1167/8.8.13.M2 Movie 2
Two control conditions, in which the segments were black and white so as to favor one interpretation or the other, were compared with test conditions in which the different interpretations would result from grouping based solely on coherent temporal modulation. In this experiment only two frequencies, in the gamma range, were used. 
The results ( Figure 11) showing the percentage of seen rotation as a function of the different conditions can be summarized as follows: 
Figure 11
 
Results of Experiment 4: The percentage of seen rotation is plotted as a function of the experimental conditions: B&W Rot: black and white rotating diamonds; B&W Trans: black and white translating diamonds; 40 Hz Rot and 27 Hz Rot: flickering rotating diamonds; 40 Hz Trans and 27 Hz Trans: flickering translating diamonds. Error bars represent 1 standard error.
Segment polarity has a clear influence on motion interpretation. Observers (n = 8) perceive either 2 diamonds translating up–down and left–right or 2 diamonds moving along circular trajectories, depending on the spatial distribution of black and white segments. However, when this distribution favors the perception of translation, this percept is somewhat less frequent than expected, suggesting that it was not as salient. 
When the only cue available to select the component motions to be bound together is temporal flicker frequency, the proportion of perceived translation and rotation significantly drops. Interestingly, the percentage of seen rotation is larger at 40 Hz as compared to 27 Hz (F(1,7) = 7.626, p < 0.02). 
However, the pattern of results is not compatible with the “binding-by-synchrony hypothesis,” as a bias toward seeing rotation is found in all conditions. A simple account of this bias is that, because both diamonds move 90° out of phase, integrating all segment motions into a whole without segmenting them into independent moving diamonds yields a global perception of circular translation. This bias toward seeing more rotation presumably reflects the limits of the selection process. Thus, in contrast with some other studies, coherent temporal modulation of the stimulus does not seem to help in selecting motion components and binding them into independent perceptual units. 
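The account above rests on a simple kinematic fact: a horizontal and a vertical sinusoidal oscillation that are 90° out of phase combine into a circular path. The sketch below illustrates only that fact. Modeling unsegmented integration as the vector sum of the two diamonds' displacements is an assumption made here for illustration, not the authors' model, and the amplitude and frequency values are placeholders.

```python
import math


def translating_pair(t, amplitude=1.0, freq_hz=1.0):
    """Centers of two diamonds oscillating on orthogonal axes, 90 degrees apart in phase."""
    w = 2 * math.pi * freq_hz
    horizontal = (amplitude * math.cos(w * t), 0.0)  # left-right diamond
    vertical = (0.0, amplitude * math.sin(w * t))    # up-down diamond; sine is cosine lagged by 90 degrees
    return horizontal, vertical


def pooled_motion(t, amplitude=1.0, freq_hz=1.0):
    """Hypothetical pooled displacement: vector sum of both diamonds' displacements."""
    (x, _), (_, y) = translating_pair(t, amplitude, freq_hz)
    return x, y


# Sample one full cycle; the pooled displacement stays at a constant distance
# from the origin, i.e., it traces a circle.
radii = [math.hypot(*pooled_motion(k / 100)) for k in range(100)]
print(min(radii), max(radii))
```

Under these assumptions the pooled displacement keeps a constant radius throughout the cycle, which is the signature of circular translation.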
Discussion and conclusion
The present experiments aimed at identifying the constraints used in selecting motion components so as to recover several object motions from disparate component motions and thus solve the superposition catastrophe problem. The results point toward a strong role of both local and global form constraints in this process. 
In Experiment 1, the “local” (but see McDermott et al., 2001) motion of line-endings resulting from occlusion, whether intrinsic or extrinsic, appears to determine the parsing of component motions into objects. This suggests that when line-endings share a common fate, as is the case for vertical or horizontal apertures that entail line-ending motion along a single motion axis, it is used as a primary constraint to select component motions, although with rather slow dynamics. In Experiment 2, similarity, either of size or of contrast polarity, does not appear as a strong constraint for motion selection since only modest biases for grouping segment motion on the basis of these features are found. At least, similarity and polarity are not strong enough to overcome grouping by proximity and closure (see Experiment 3), where motion selection appears to rely on Gestalt rules such as good continuation, proximity, and closure, all related to static form processing. Indeed, in these configurations, the four inner (respectively outer) segments are first grouped. Motion integration is performed on these selected motion components, whatever the type of motion to which this integration leads, expanding/contracting or translating diamonds. 
Finally, Experiment 4 shows that temporal modulation in the gamma range has little effect on motion selection. Since the lack of an effect of this parameter may have resulted from the strong constraint related to form information, this form information was removed in the second display of Experiment 4 such that the 4 inner segments do not define a rigid shape anymore. In this situation, temporal grouping does not seem to drive motion selection either. We do not think, however, that this finding strongly argues against the binding-by-synchrony hypothesis (Singer, 1999). Temporally modulating the stimulus presumably recruits a population of neurons that respond in the same temporal window (think of EEG or MEG imaging techniques). However, neuronal response latencies vary with a number of parameters (e.g., contrast, spatial frequency, cell type) and depend on the relationships between the neuronal selectivity and the actual stimulus as well as on the current state of the neuronal network. Thus, temporal modulation may not elicit neuronal synchronization in the millisecond range (Singer, 1999) that could be engaged in, or responsible for, the formation of functional neuronal assemblies devoted to some particular processing of the stimulus that could engage cortico-cortical interactions. 
Temporal modulation is used here to test whether observers can use this temporal cue to select and bind motion components into a whole. The results suggest it is used only when observers are able to consciously perceive temporal flicker, despite the fact that neurons in thalamus and primary visual cortex may follow these unseen temporal modulations. We do not think, however, that this can be used to decipher whether form/motion binding induces, is correlated with, or relies on synchronization, which should be an intrinsic functional property of the neuronal network, “blind” to extrinsic manipulations of the physical stimulus. 
Overall, the results suggest a hierarchy of constraints in which local cues—line-endings—are more powerful to drive selection than global form cues—closure—with only a modulatory effect of similarity—in polarity or spatial frequency—or of temporal modulation. 
Additional observations with stereoscopic displays indicate that recovering the motion of two diamonds moving in different depth planes is easy, suggesting depth cues are a powerful vector for selection. Finally, asking observers to selectively attend to a subset of component motions permits the selection of one diamond and its global motion independently of the second diamond. In this way, it is possible to voluntarily switch from one percept to another. Although this shows that attention can deeply modify the phenomenal appearance of these stimuli, it is worth noting that such attentional processes operate only if the observer is cued toward specific locations. This cueing can hardly be used in the example of Figure 1, where the larger number of component motions might prevent the attentional selection of a large subset of segments in order to selectively integrate their respective motions into a rigidly moving shape. In this respect, attentional control of motion selection requires prior knowledge of an expected perceptual outcome, which can only emerge after bottom-up selection and integration have taken place. 
These psychophysical results cannot by themselves provide insights into the neural implementation of a selection process. Let us just note that several models of motion integration consider cortical areas MT/MST, where most neurons are direction selective and which contain cells that respond to a global integrated stimulus direction rather than to its motion components (Duffy & Wurtz, 1995; Movshon et al., 1986). Modeling these MT cells has often involved computations performed in a velocity space lacking spatial organization (e.g., Rust, Mante, Simoncelli, & Movshon, 2006; Simoncelli & Heeger, 1998), such that the spatial structure of their inputs was not taken into consideration. These and other results suggest that form constraints gate the motion combination process, thereby calling for either an early selection guided by form constraints (Lorenceau & Alais, 2001), interactions between form and motion processing streams, or integration of form and motion at further processing stages. Although numerous studies uncovered the constraints and mechanisms underlying early contour integration into global form (for a review, see Hess, Hayes, & Field, 2003), recent electrophysiological recordings also support the latter hypotheses: Huang, Albright, and Stoner (2007) found a strong modulation of the response of MT neurons by contextual stimuli such as a shape surrounding the MT receptive fields, which may reflect interactions between form and motion processing streams. On the other hand, some neurons in area STPa (Jellema, Maassen, & Perrett, 2004; Oram & Perrett, 1994) fire selectively to both form and motion, suggesting a possible site for form/motion integration. 
Acknowledgments
This work was funded by ACI “Neurosciences Computationnelles” to JL. 
Commercial relationships: none. 
Corresponding author: Jean Lorenceau. 
Address: LENA CNRS-UPMC, 47 Bd de l'Hôpital, 75013, Paris, France. 
References
Adelson, E. H., & Movshon, J. A. (1982). Phenomenal coherence of moving visual patterns. Nature, 300, 523–525.
Alais, D., Blake, R., & Lee, S. H. (1998). Visual features that vary together over time group together over space. Nature Neuroscience, 1, 160–164.
Blakemore, C., & Campbell, F. W. (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. The Journal of Physiology, 203, 237–260.
Campbell, F. W., & Robson, J. G. (1968). Application of Fourier analysis to the visibility of gratings. The Journal of Physiology, 197, 551–566.
Duffy, C. J., & Wurtz, R. H. (1995). Response of monkey MST neurons to optic flow stimuli with shifted centers of motion. Journal of Neuroscience, 15, 5192–5208.
Duncan, R. O., Albright, T. D., & Stoner, G. R. (1994). Occlusion and the interpretation of visual motion: Perceptual and neuronal effects of context. Journal of Neuroscience, 20, 5885–5897.
Eckhorn, R. (2000). Cortical synchronization suggests neural principles of visual feature grouping. Acta Neurobiologiae Experimentalis, 60, 261–269.
Eckhorn, R., Bauer, R., Jordan, W., Brosch, M., Kruse, W., & Munk, M. (1988). Coherent oscillations: A mechanism of feature linking in the visual cortex? Biological Cybernetics, 60, 218–226.
Farid, H., & Adelson, E. H. (2001). Synchrony does not promote grouping in temporally structured displays. Nature Neuroscience, 4, 875–876.
Fennema, C. L., & Thompson, W. B. (1979). Velocity determination in scenes containing several moving objects. Computer Graphics and Image Processing, 9, 301–315.
Gibson, J. J. (1979). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum Associates.
Gray, C. M., & Singer, W. (1989). Stimulus-specific neuronal oscillations in orientation columns of cat visual cortex. Proceedings of the National Academy of Sciences of the United States of America, 86, 1698–1702.
Grossberg, S., Mingolla, E., & Viswanathan, L. (2001). Neural dynamics of motion integration and segmentation within and across apertures. Vision Research, 41, 2521–2553.
Hess, R. F., Hayes, A., & Field, D. J. (2003). Contour integration and cortical processing. Journal of Physiology (Paris), 97, 105–119.
Huang, X., Albright, T. D., & Stoner, G. R. (2007). Adaptive surround modulation in area MT. Neuron, 53, 761–770.
Jellema, T., Maassen, G., & Perrett, D. I. (2004). Single cell integration of animate form, motion and location in the superior temporal cortex of the macaque monkey. Cerebral Cortex, 14, 781–790.
Koechlin, E., Anton, J. L., & Burnod, Y. (1999). Bayesian inference in populations of cortical neurons: A model of motion integration and segmentation in area MT. Biological Cybernetics, 80, 25–44.
Koffka, K. (1935). Principles of gestalt psychology. New York: Harcourt, Brace and Co.
Kovacs, I., & Julesz, B. (1993). A closed curve is much more than an incomplete one: Effect of closure in figure-ground segmentation. Proceedings of the National Academy of Sciences of the United States of America, 90, 7495–7497.
Lee, S. H., & Blake, R. (1999). Visual form created solely from temporal structure. Science, 284, 1165–1168.
Leonards, U., Singer, W., & Fahle, M. (1996). The influence of temporal phase differences on texture segmentation. Vision Research, 36, 2689–2697.
Líden, L., & Pack, C. (1999). The role of terminators and occlusion cues in motion integration and segmentation: A neural network model. Vision Research, 39, 3301–3320.
Lorenceau, J. (1996). Motion integration with dot patterns: Effects of motion noise and structural information. Vision Research, 36, 3415–3427.
Lorenceau, J., & Alais, D. (2001). Form constraints in motion binding. Nature Neuroscience, 4, 745–751.
Lorenceau, J., & Boucart, M. (1995). Effects of a static textured background on motion integration. Vision Research, 35, 2303–2314.
Lorenceau, J., Giersh, A., & Seriès, P. (2005). Dynamic competition between contour integration and contour segmentation probed with moving stimuli. Vision Research, 45, 103–116.
Lorenceau, J., & Shiffrar, M. (1992). The influence of terminators on motion integration across space. Vision Research, 32, 263–273.
Lorenceau, J., & Shiffrar, M. (1999). The linkage of visual motion signals. Visual Cognition, 6, 431–460.
Lorenceau, J., Shiffrar, M., Wells, N., & Castet, E. (1993). Different motion sensitive units are involved in recovering the direction of moving lines. Vision Research, 33, 1207–1217.
Lorenceau, J., & Zago, L. (1999). Cooperative and competitive spatial interactions in motion integration. Visual Neuroscience, 16, 755–770.
McDermott, J., Weiss, Y., & Adelson, E. H. (2001). Beyond junctions: Nonlocal form constraints on motion interpretation. Perception, 30, 905–923.
Mingolla, E., Todd, J. T., & Norman, J. F. (1992). The perception of globally coherent motion. Vision Research, 32, 1015–1031.
Movshon, J. A., Adelson, E. H., Gizzi, M. S., & Newsome, W. T. (1986). The analysis of moving visual patterns. Experimental Brain Research, 11, 117–152.
Nowlan, S. J., & Sejnowski, T. J. (1995). A selection model for motion processing in area MT of primates. Journal of Neuroscience, 15, 1195–1214.
Oram, M., & Perrett, D. (1994). Responses of anterior superior temporal polysensory (STPa) neurons to “biological motion” stimuli. Journal of Cognitive Neuroscience, 6, 99–116.
Palanca, B. J., & DeAngelis, G. C. (2005). Does neuronal synchrony underlie visual feature grouping? Neuron, 46, 333–346.
Qiu, F. T., Sugihara, T., & von der Heydt, R. (2007). Figure-ground mechanisms provide structure for selective attention. Nature Neuroscience, 10, 1492–1499.
Rubin, N., & Hochstein, S. (1993). Isolating the effect of one-dimensional motion signals on the perceived direction of moving two-dimensional objects. Vision Research, 10, 1385–1396.
Rust, N. C., Mante, V., Simoncelli, E. P., & Movshon, J. A. (2006). How MT cells analyze the motion of visual patterns. Nature Neuroscience, 9, 1421–1431.
Shiffrar, M., & Lorenceau, J. (1996). Increased motion linking across edges with decreased luminance contrast, edge width and duration. Vision Research, 36, 2061–2067.
Shimojo, S., Silverman, G. H., & Nakayama, K. (1989). Occlusion and the solution to the aperture problem for motion. Vision Research, 29, 619–626.
Simoncelli, E. P., & Heeger, D. (1998). A model of neuronal responses in visual area MT. Vision Research, 38, 743–761.
Singer, W. (1999). Neuronal synchrony: A versatile code for the definition of relations? Neuron, 24, 49–65.
Stoner, G., & Albright, T. (1994). Visual motion integration: A neurophysiological and psychophysical perspective. In A. Smith & R. J. Snowden (Eds.), Visual detection of motion. London: Academic Press.
Stoner, G. R., & Albright, T. D. (1992). Neural correlates of perceptual motion coherence. Nature, 358, 412–414.
Thiele, A., & Stoner, G. (2003). Neuronal synchrony does not correlate with motion coherence in cortical area MT. Nature, 421, 366–370.
von der Malsburg, C. (1981). The correlation theory of brain function. Gottingen: Max Planck Institute for Biophysical Chemistry.
von der Malsburg, C. (1999). The what and why of binding: The modeler's perspective. Neuron, 24, 95–104.
von der Malsburg, C., & Schneider, W. (1986). A neural cocktail-party processor. Biological Cybernetics, 54, 29–40.
Watamaniuk, S. N. J., & Sekuler, R. (1992). Temporal and spatial integration in dynamic random-dot stimuli. Vision Research, 32, 2341–2348.
Watson, A. B. (1982). Summation of grating patches indicates many types of detector at one retinal location. Vision Research, 22, 17–25.
Weiss, Y., & Adelson, E. H. (1996). A unified mixture framework for motion segmentation: Incorporating spatial coherence and estimating the number of models. 1996 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'96) (p. 321).
Williams, D. W., & Sekuler, R. (1984). Coherent global motion percepts from stochastic local motions. Vision Research, 24, 55–62.
Wilson, H. R., & Bergen, J. R. (1979). A four mechanism model for threshold spatial vision. Vision Research, 19, 19–32.
Yo, C., & Wilson, H. R. (1992). Perceived direction of moving two-dimensional patterns depends on duration, contrast and eccentricity. Vision Research, 32, 135–147.
Yu, C., & Levi, D. M. (1999). The time course of psychophysical end-stopping. Vision Research, 39, 2063–2073.
Wilson, H. R., & Kim, J. (1994). A model for motion coherence and transparency. Visual Neuroscience, 11, 1205–1220.
Nowlan, S. J., & Sejnowski, T. J. (1994). Filter selection model for motion segmentation and velocity integration. Journal of the Optical Society of America, 11, 3177–3200.
Pack, C. C., & Born, R. T. (2001). Temporal dynamics of a neural solution to the aperture problem in visual area MT of macaque brain. Nature, 409, 1040–1042.
Rosenblatt, F. (1961). Principles of neurodynamics: Perceptrons and the theory of brain mechanisms.
Singer, W., & Gray, C. M. (1995). Visual feature integration and the temporal correlation hypothesis. Annual Review of Neuroscience, 18, 555–586.
Figure 1
 
(Top) The problem of motion selection illustrated with two geometrical shapes moving in different directions behind apertures. (Bottom) Schematic representation of the activity evoked in primary visual cortex by component motions. Recovering global object motion requires that the visual system combines component motions from the same shape and avoids spurious combinations with component motions from a different shape, i.e., solve the superposition catastrophe problem.
Figure 2
 
Stimuli used in the experiment. A static view of two diamonds seen behind apertures (top) can be interpreted in different ways (bottom): overlapping diamonds of same size (a, c) or a small diamond surrounded by a large one (b). Observers generally favor this latter interpretation, as expected if good continuation, proximity, and closure are used to group the line segments into a whole.
Figure 3
 
When two diamonds of different (top) or equal (bottom) size seen behind crossed apertures oscillate out of phase in opposite directions (large arrows), the local component motions (thin arrows) can be combined in different ways, resulting either in the perception of two translating diamonds or of two expanding/contracting diamonds.
Figure 4
 
Top: Proportion of perceived expansion, averaged across six observers, plotted as a function of aperture orientation for visible and invisible apertures at two durations of motion. Bottom: Standard deviations across observers as a function of aperture orientation. See text for details.
Figure 5
 
Stimuli used in Experiment 2 with line segments of varying widths and polarity. In the experiment, the diamonds translated sinusoidally in opposite phase. Observers were asked whether they perceived two translating diamonds (of the same size) or two expanding/contracting diamonds (of different sizes).
Figure 6
 
Proportion of perceived expansion, averaged across six observers, plotted for the stimuli shown in Figure 5. Error bars represent one standard error.
Figure 7
 
Stimuli used in Experiment 3. Diamonds of equal or different sizes were made of segments of similar or different widths. In each trial, the 2 diamonds translated sinusoidally out of phase along a horizontal axis.
Figure 7
 
Stimuli used in Experiment 3. Diamonds of equal or different sizes were made of segments of similar or different widths. In each trial, the 2 diamonds translated sinusoidally out of phase along a horizontal axis.
Figure 8
 
Proportion of perceived translation, averaged across six observers, plotted for the stimuli shown in Figure 7. Error bars represent ±1 standard error.
Figure 8
 
Proportion of perceived translation, averaged across six observers, plotted for the stimuli shown in Figure 7. Error bars represent ±1 standard error.
Figure 9
 
Results of Experiment 4 showing the proportion of seen translation averaged across observers as a function of flicker frequency. NA corresponds to a no-flicker condition where expansion/contraction is seen. Error bars represent ±1 standard error.
Figure 9
 
Results of Experiment 4 showing the proportion of seen translation averaged across observers as a function of flicker frequency. NA corresponds to a no-flicker condition where expansion/contraction is seen. Error bars represent ±1 standard error.
Figure 10
 
Stimulus of Experiment 4: Depending on which segments are selected for motion grouping, observers can see either two diamonds translating up–down and left–right (left) or two diamonds rotating along a circular trajectory with a 90° phase lag (right). Red and green are used here for clarity. Black and white segments were used in a control condition. In test conditions, all segments were white and flickered out of phase at a high temporal frequency (40 or 27 Hz). See Movie 2 for a demo of the possible groupings based on green/red colour cues, and of the ambiguity that exists when no such cues are available.
10.1167/8.8.13.M2 Movie 2
Figure 11
 
Results of Experiment 4: The percentage of seen rotation is plotted as a function of the experimental conditions. B&W Rot: black and white rotating diamonds; B&W Trans: black and white translating diamonds; 40 Hz Rot and 27 Hz Rot: flickering rotating diamonds; 40 Hz Trans and 27 Hz Trans: flickering translating diamonds. Error bars represent 1 standard error.