Research Article  |   August 2008
Distortion in perceived image size accompanies flash lag in depth
Author Affiliations
  • Terence C. P. Lee
    Department of Psychology, The University of Hong Kong, Hong Kong SAR, China
    Psychology Unit, Hong Kong Baptist University, Hong Kong SAR, China; terencel@hkbu.edu.hk
  • Sieu K. Khuu
    Department of Psychology, The University of Hong Kong, Hong Kong SAR, China
    School of Optometry and Vision Science, The University of New South Wales, Sydney, NSW, Australia; s.khuu@unsw.edu.au
  • Wang Li
    Department of Psychology, The University of Hong Kong, Hong Kong SAR, China
    Department of Counselling and Psychology, Hong Kong Shue Yan University, Hong Kong SAR, China; woli@hksyu.edu
  • Anthony Hayes
    Department of Psychology, The University of Hong Kong, Hong Kong SAR, China
    School of Psychology, University College Dublin, Belfield, Ireland; Tony.Hayes@ucd.ie
Journal of Vision August 2008, Vol.8, 20. doi:10.1167/8.11.20
Abstract

The flash-lag effect—a misperception that a flashed object appears to lag behind a moving object despite their physical alignment—has mainly been investigated as a spatiotemporal offset. Here, we report that the flash-lag-in-depth effect is accompanied by an illusory change in the apparent size of the flashed object. We found a strong flash-lag-in-depth effect with a dot-defined square, whose motion in depth was signaled by changing retinal disparity (stereomotion), and a Gaussian blob that was flashed in the center of the square. Using the same stimulus, observers matched the apparent size of the flashed blob with a reference blob when the square moved with approaching or receding motion. Approaching motion of the square resulted in a reduction in the apparent size of the flashed blob, and an apparent enlargement of the flashed blob was induced by receding motion of the square. Additionally, this size effect substantially diminished, or was eliminated, when looming (change of size) instead of stereomotion was used to cue motion in depth of the square. The flashed-object size change that is induced by the moving square is not explained by simple predictions from projective geometry.

Introduction
The human visual system is often required to determine the properties of objects viewed in motion or viewed in the context of a dynamic background. This contextual motion can significantly influence the judgment of object properties. For example, motion has been shown strongly to influence the apparent position of objects (e.g., De Valois & De Valois, 1991; Freyd & Finke, 1984; Fröhlich, 1923; Hayes, 2000). A particularly well-documented phenomenon demonstrating perceptual distortion of position is the flash-lag effect, in which a flashed object that is physically aligned with a moving object is perceived to be located behind—lagging behind—the moving object (MacKay, 1958; Nijhawan, 1994). 
While a number of studies have been conducted on the flash-lag effect, and several models have been proposed to account for it (e.g., Baldo & Klein, 1995; Eagleman & Sejnowski, 2000; Krekelberg & Lappe, 2000; Nijhawan, 1994; Whitney & Murakami, 1998), most investigations have been concerned with the influence of frontoparallel motion on the estimation of position in the two-dimensional image plane (2D). Not until recently have there been reports of a flash-lag effect along the line of sight—an apparent shift of position in depth induced by motion in depth (see, e.g., Harris, Duke, & Kopinska, 2006; Ishii, Seekkuarachchi, Tamura, & Tang, 2004). Flash lag in the image plane and flash lag in the median plane are unlikely to result simply from idiosyncrasies inherent in mechanisms that encode position. While the neural representation of an object's 2D position and motion can be traced back to the spatiotemporal changes in luminance encoded by retinal cells that are local to each other, the recovery of position in depth and motion in depth demands analyses of retinal images such as, with stereo depth, the extraction of the difference between positions registered by the pair of retinae (binocular disparity) and how these differences change over time (stereomotion). Since the perception of flash lag in depth implies the use of depth cues by the visual system to estimate relative positions in the median plane, the stimulus becomes a useful probe for understanding the visual system's ability to construct a representation in the third dimension (3D). In the present study, we employed a flash-lag-in-depth stimulus to investigate the contributing factors underlying the visual system's ability to derive an estimate of an object's image size in depth. 
In published studies of flash lag in depth (e.g., Harris et al., 2006; Ishii et al., 2004), the spatial offset that occurred between the flashed object and the moving object was primarily defined by retinal disparity. In natural scenes, the visual system derives an estimate of distance in depth from a host of visual cues, among which an influential cue is retinal size, which scales inversely with distance from the observer, given an object of constant physical size (Emmert, 1881, in Boring, 1940). The intimate relationship between apparent size and apparent distance in depth has been illustrated by classic size illusions such as the Ames room (Ames, 1952), the moon illusion (see Rock & Kaufman, 1962), and the Ponzo illusion (see Gregory, 1966). It is reasonable to suppose that inherent in the illusion of flash lag in depth is a 3D perceptual space, and an intriguing question is whether the apparent displacement in depth of the flashed object is accompanied by a change in its apparent size. 
Kanai and Verstraten (2006), with a 2D flash-lag configuration, showed that the aligned position and the displaced position of the moving object can be perceived concurrently, and they suggested that the two positions are represented in early cortical areas and in higher visual areas, respectively. With flash lag in depth, if both the aligned and displaced positions of the flashed and moving objects are represented by the visual system, the question arises whether the visual system, when estimating the size of the flashed object, derives a size estimate from comparison with the moving object's aligned position, or from its illusory displaced position, in the median plane. In the former case, a veridical percept of size would be expected regardless of the moving object's apparent position in depth, whereas in the latter case, if the visual system derives an estimate of size based on the moving object's illusory position, the flashed object may appear larger or smaller depending on the moving object's direction of motion in depth. 
To investigate whether the visual system is susceptible to a distortion in the apparent size of an object that is flashed alongside another object that is moving in depth, we adopted a modified version of the flash-lag paradigm. In Experiment 1 we established and quantified the flash-lag-in-depth effect; in Experiment 2, using the same stimulus configuration as in Experiment 1, we measured the apparent size of the flashed object. 
In Experiments 1 and 2, motion in depth was generated primarily by stereomotion. However, in natural scenes stereomotion is invariably accompanied by looming: a 2D change in image size. A number of studies have shown that looming and stereomotion mutually contribute to the processing of motion in depth (e.g., Gray, Macuga, & Regan, 2004; Gray & Regan, 1998; Heuer, 1993; Regan & Beverley, 1979; Rushton & Wann, 1999; Savelsbergh, Whiting, & Bootsma, 1991). Since our interest is in a possible size effect that is contingent on position and movement in 3D perceptual space (constructed from a variety of cues), and some published findings show task dependency for depth-cue combination (e.g., Bradshaw, Parton, & Glennerster, 2000; Brenner & van Damme, 1999; Hillis, Ernst, Banks, & Landy, 2002), in Experiment 3 we investigated whether size distortion is produced with looming, and the results are compared to those with stereomotion. 
Experiment 1: Quantifying apparent displacement due to flash lag in depth
Size perception of a flashed object that appears to lag behind another object moving in depth is the main focus of the present study. However, conduct of the study is contingent on the establishment of a strong and reliable effect of flash lag in depth. Experiment 1 was designed to quantify the flash-lag effect with our stimulus: a dot-defined square traversing depth at different speeds, with its motion in depth cued by changing retinal disparity (stereomotion). A luminance blob was flashed in the center of the square, and its apparent relative position in depth was measured using the method of adjustment. The position in depth of the flashed blob was adjusted until it appeared to align in depth with the moving square. Flash lag in depth has been reported in published investigations as the temporal offset between the flashed object and the moving object that is required for their perceptual alignment (see Harris et al., 2006). In the present study we quantified flash lag in depth using a disparity adjustment so as to obtain a direct measure of perceived displacement in depth.
Method
Observers
Three experienced psychophysical observers (TL, AL, and SK) participated in the experiment. All three observers had good stereopsis and normal or corrected-to-normal visual acuity.
Stimuli
The stimuli were pairs of stereograms containing an orthographically presented square (4.63° × 4.63° at zero disparity) defined by 120 white, circular, anti-aliased dots (132 cd m⁻²; diameter 0.11°) that occupied randomly chosen positions on a gray background (30 cd m⁻²). Dots were prevented from overlapping and were prevented from appearing in a central square region (1.52° × 1.52°) of the stimulus. To prevent the tracking of individual dots, all dots were generated asynchronously, with a limited lifetime of 333 ms, and were replotted back into the square when they expired. Figure 1A illustrates the stimulus arrangement.
Figure 1
 
(A) The setup used in Experiment 1. A dot-defined square moved toward or away from the observer, with motion in depth specified by stereomotion. Halfway through every motion sequence, a Gaussian blob was flashed in the center of the square. Observers were required to align the position in depth of the blob with the position in depth of the square. (B) An illustration of the setup used in Experiment 2. Observers were required to adjust the blobs so as to match their size when they were presented in the context of opposite directions of motion in depth of the square.
The dot-defined square moved at one of four speeds in depth (6.27, 9.40, 12.54, and 15.67 cm s⁻¹), which were simulated by horizontal movements of the stereo-pair images in opposite directions at four monocular speeds (0.69, 1.04, 1.39, or 1.73 deg s⁻¹). Disparity values were translated into depth values by the equation $\eta \approx I\delta / D^{2}$, where $\eta$ is the disparity, $I$ is the interocular separation, $\delta$ is the simulated depth, and $D$ is the viewing distance. Speed in depth, $V_z$, was derived from the equation $V_z \approx D^{2} V_\delta / I$, where $V_\delta$ is the rate of change in disparity. Since the distance in depth traversed by the square was held constant (±4.18 cm), the stimulus duration was inversely proportional to image speed. The stimulus durations were 1.33, 0.87, 0.67, or 0.53 s, which corresponded to the range of speed in depth from the slowest to the fastest. Two horizontal bars (1.52° × 0.07°) were continuously presented 3.85° above and below the center of the square to act as landmarks denoting zero disparity. At an intermediate point of every motion sequence, a Gaussian blob was presented in the central region of the square for 33 ms (two frames). The size of the Gaussian blob was held constant ($\sigma$ = 22.20 arcmin) (see Movie 1). The luminance profile of the blob was defined by the equation $L(x, y) = C_0 \exp\left(-\left(\frac{(x - x_0)^{2}}{\sigma_x^{2}} + \frac{(y - y_0)^{2}}{\sigma_y^{2}}\right)\right)$, where $x_0$ and $y_0$ denote the center position of the square, $x$ and $y$ denote the coordinates with respect to $x_0$ and $y_0$, $\sigma_x$ and $\sigma_y$ denote the horizontal and vertical standard deviations of the Gaussian, and $C_0$ denotes the peak luminance of the profile, which was set to 132 cd m⁻² (Weber contrast, 340%).
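The conversion from monocular image speed to speed in depth, and from speed to stimulus duration, can be checked with a short calculation. The sketch below is illustrative only: the interocular separation (6.5 cm) is an assumption, since no value is given in the text, and the disparity rate is taken to be twice the monocular speed because the two half-images move in opposite directions. Under those assumptions the computed speeds fall within roughly 0.1 cm s⁻¹ of the reported ones.

```python
import math

VIEWING_DISTANCE_CM = 41.0   # D, viewing distance reported for the stereoscope
INTEROCULAR_CM = 6.5         # I, ASSUMED value; not stated in the text
HALF_RANGE_CM = 4.18         # the square traversed +/-4.18 cm about zero disparity


def speed_in_depth(monocular_deg_per_s: float) -> float:
    """Approximate speed in depth (cm/s): V_z ~ D^2 * V_delta / I."""
    # Assumption: disparity changes at twice the monocular speed,
    # because the two half-images move in opposite directions.
    disparity_rate_rad = math.radians(2.0 * monocular_deg_per_s)
    return VIEWING_DISTANCE_CM ** 2 * disparity_rate_rad / INTEROCULAR_CM


def duration_s(speed_cm_per_s: float) -> float:
    """Time needed to traverse the full depth range at a constant speed."""
    return 2.0 * HALF_RANGE_CM / speed_cm_per_s


if __name__ == "__main__":
    for mono in (0.69, 1.04, 1.39, 1.73):
        vz = speed_in_depth(mono)
        print(f"{mono:.2f} deg/s -> {vz:.2f} cm/s, duration {duration_s(vz):.2f} s")
```

Because the traversed distance is fixed, doubling the speed halves the duration, which is why the slowest speed corresponds to the longest sequence.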
 
Movie 1
 
Anaglyph representation of the stereo-pair images for the dot-defined square and the Gaussian blob.
The stimuli were generated on a Macintosh G4 1.3-GHz computer using custom software written in MATLAB (version 5.3) and were displayed on a calibrated monitor running at 60 Hz. A custom-built Wheatstone stereoscope was used to present stimuli at a viewing distance of 41 cm. 
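As a rough illustration of the blob parameters described above, the following Python sketch evaluates the luminance profile as printed, the Weber contrast of the peak against the gray background, and the number of frames in a 33-ms flash at 60 Hz. It is not the original MATLAB code; the function names are hypothetical, and how the profile was blended with the background is not stated in the text, so the sketch evaluates the printed expression only.

```python
import math

PEAK_CD_M2 = 132.0         # C_0, peak luminance of the blob
BACKGROUND_CD_M2 = 30.0    # gray background luminance
SIGMA_ARCMIN = 22.20       # standard deviation of the Gaussian
FRAME_RATE_HZ = 60.0       # monitor refresh rate


def blob_luminance(x, y, x0=0.0, y0=0.0,
                   sigma_x=SIGMA_ARCMIN, sigma_y=SIGMA_ARCMIN,
                   peak=PEAK_CD_M2):
    """Luminance profile as printed: C_0 * exp(-(dx^2/sx^2 + dy^2/sy^2))."""
    exponent = (x - x0) ** 2 / sigma_x ** 2 + (y - y0) ** 2 / sigma_y ** 2
    return peak * math.exp(-exponent)


def weber_contrast(peak, background):
    """Weber contrast of a peak luminance against a uniform background."""
    return (peak - background) / background


def frames_for(duration_s, frame_rate_hz=FRAME_RATE_HZ):
    """Whole number of video frames closest to the requested duration."""
    return round(duration_s * frame_rate_hz)


if __name__ == "__main__":
    print(blob_luminance(0.0, 0.0))                     # peak at the center
    print(weber_contrast(PEAK_CD_M2, BACKGROUND_CD_M2)) # contrast as a ratio
    print(frames_for(0.033))                            # frames in the flash
```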
Procedure
Observers were presented with a single motion sequence in each trial, and they were required to align the disparity-defined position in depth of the Gaussian blob with that of the dot-defined square when the blob appeared, using a modified method-of-adjustment procedure. Within every experimental run, there were two randomly interleaved sets of trials; one set consisted of approaching motion and the other consisted of receding motion. After the presentation of a motion sequence, observers pressed appropriate keys to increase or decrease the disparity offset of the blob by a constant step size of 1.0 arcmin, to repeat the presentation, or to indicate that the blob appeared to be aligned with the square in depth. At the onset of each trial, the disparity of the blob was randomized within ±13.32 arcmin of zero disparity. Six runs were conducted and the results were averaged.
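The logic of the adjustment procedure can be sketched with a simulated observer. This is a toy model under stated assumptions, not the experimental code: the observer is assumed to be noise-free and to step toward a fixed point of perceived alignment (a hypothetical value chosen for illustration), stopping once the offset is within half a step of it.

```python
import random

STEP_ARCMIN = 1.0            # constant step size from the procedure
START_RANGE_ARCMIN = 13.32   # starting disparity randomized within +/- this range


def run_trial(perceived_alignment, rng,
              step=STEP_ARCMIN, start_range=START_RANGE_ARCMIN):
    """Simulate one trial with an idealized, noise-free observer.

    perceived_alignment is the HYPOTHETICAL disparity offset (arcmin) at
    which this observer sees the blob as aligned with the square.
    Returns the final setting and the number of key presses.
    """
    offset = rng.uniform(-start_range, start_range)
    presses = 0
    # Step toward the perceived alignment until within half a step of it.
    while abs(offset - perceived_alignment) > step / 2:
        offset += step if offset < perceived_alignment else -step
        presses += 1
    return offset, presses


def mean_setting(perceived_alignment, n_runs=6, seed=0):
    """Average the final settings over several runs, as in the procedure."""
    rng = random.Random(seed)
    settings = [run_trial(perceived_alignment, rng)[0] for _ in range(n_runs)]
    return sum(settings) / n_runs


if __name__ == "__main__":
    print(mean_setting(3.0))  # 3.0 arcmin is an illustrative value only
```

Randomizing the starting disparity on every trial prevents the observer from reaching the setting by counting key presses.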
Results
Average position-in-depth adjustments of the Gaussian blob are plotted in Figure 2 as offsets relative to the position in depth of the dot-defined square when the blob was flashed. Since the data collected for approaching and receding motion were of similar size, they were collapsed so that a zero offset indicates an alignment between the blob and the square at zero disparity. A positive offset indicates that the blob was displaced in depth along the direction of motion in depth (i.e., the blob was adjusted further away from the observer for receding motion, but closer to the observer for approaching motion), and a negative offset indicates a displacement in depth against the direction of motion in depth. Offsets are plotted as a function of the speed in depth of the square. 
Figure 2
 
Position-in-depth adjustments of the Gaussian blob are plotted as offsets relative to the physically aligned position in depth and as a function of speed in depth of the dot-defined square. Error bars represent ±1 SEM.
The pattern of results is similar for all observers and shows strong positive offsets, which indicate that, according to the direction of motion in depth, observers placed the blob ahead of the square so as to perceive them both as aligned in depth. This finding demonstrates a compelling flash-lag-in-depth effect; i.e., the flashed blob appeared to lag behind the moving square in the median plane. Additionally, this flash-lag effect is speed dependent: the average offset increases from 0.13 to 0.67 cm as the speed of the square increases from 6.27 cm s⁻¹ to 15.67 cm s⁻¹, suggesting a larger flash lag in depth with faster speed in depth. An apparently different conclusion is drawn by Harris et al. (2006), who asked observers to null the flash-lag effect by adjusting the relative timing of the flashed and moving objects. Harris et al. found that the magnitude of flash lag, measured as a temporal offset, remained constant over a range of stereomotion speeds, and from these results they concluded that changing speed does not affect the perception of flash lag in depth. However, a constant flash lag in time for different speeds implies, for a faster speed, a larger perceived distance in depth between the flashed and moving objects. In other words, the present results are compatible with Harris et al., and both sets of results suggest that the spatial extent of flash lag in depth depends on the speed of the moving object.
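The compatibility argument rests on distance being the product of speed and time: a lag that is constant in time corresponds to a spatial offset that grows with speed. The sketch below makes this concrete for the four speeds used here. The lag value is hypothetical and chosen only for illustration; it is not taken from Harris et al. (2006) or from the present data.

```python
SPEEDS_CM_PER_S = (6.27, 9.40, 12.54, 15.67)  # speeds in depth used in Experiment 1
HYPOTHETICAL_LAG_S = 0.04                     # ASSUMED constant temporal lag


def spatial_offset_cm(speed_cm_per_s, lag_s):
    """Distance the moving object covers during a fixed temporal lag."""
    return speed_cm_per_s * lag_s


offsets = [spatial_offset_cm(v, HYPOTHETICAL_LAG_S) for v in SPEEDS_CM_PER_S]

if __name__ == "__main__":
    for v, dz in zip(SPEEDS_CM_PER_S, offsets):
        print(f"{v:5.2f} cm/s -> {dz:.3f} cm")
```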
Experiment 2: Quantifying apparent size of an object flash lagged in depth
Having established a credible flash-lag-in-depth effect with our stimulus, in Experiment 2 we investigated whether flash lag in depth is also accompanied by a perceptible change in the apparent size of the flashed stimulus. As noted previously, a size change is indicative of a process of size estimation by the visual system that takes into account the illusory displaced position in depth of the flashed object. We measured the apparent size of the flashed object by employing two flash-lag-in-depth stimuli, and observers were required to compare and match their apparent size using the method of adjustment.
Procedure
The setup of Experiment 2, as illustrated in Figure 1B, was similar to that of Experiment 1, except that corresponding dots in each of the two stereo images moved horizontally in opposite directions at 1.39 deg s⁻¹, so that the dot-defined square appeared to move toward or away from the observer at 12.54 cm s⁻¹. This speed in depth corresponded to the second fastest dot speed used in Experiment 1 and resulted in a flash-lag average displacement of 0.29–0.44 cm. On each trial, a motion sequence of approaching motion and a motion sequence of receding motion were presented in randomized order. The size of the Gaussian blob in one of the two directions of motion in depth (the reference sequence) was held constant (σ = 22.20 arcmin). The task for observers was to adjust the size of the Gaussian blob in the opposite direction of motion (the test sequence) until it appeared to be equivalent in size with the blob in the reference sequence. This procedure resulted in two comparable conditions: in one condition the apparent size of the flashed object in the context of approaching motion was measured, whereas in the other condition a similar measure was made with receding motion. Observers pressed appropriate keys to increase or decrease the size of the blob in the test sequence by changing the standard deviation of the Gaussian at a constant step size of 0.56 arcmin, to repeat the presentation, or to indicate that the blobs appeared to match in size. At the onset of each trial, the standard deviation of the Gaussian in the test sequence was randomized within a range of 18.87 to 25.53 arcmin. Six or more trials were conducted for each condition, in interleaved random order, and the results were averaged. The three observers who participated in Experiment 1 and two observers naive to the aims of the experiment, with normal or corrected-to-normal vision (IT and LF), took part in Experiment 2.
Results
Average adjustments made to the size of the Gaussian blob in the test sequence are plotted in Figure 3 as standard deviations of the Gaussian. Data collected for approaching and receding motion are plotted separately. The dashed line represents the size of the blob in the reference sequence ( σ = 22.20 arcmin). The patterns of results in Figure 3 are similar (with some individual variation) and all five observers adjusted the size of the test blob in the context of an approaching object so that it was larger than the reference blob for both stimuli to appear to be matched in size. The result, in the context of a receding object, is the opposite, and observers adjusted the size of the flashed object to be smaller than the reference stimulus to appear to match its size. These results indicate that with an approaching object a reduction in the apparent size of the flashed target blob resulted, while an apparent enlargement of the flashed target blob was induced with a receding object. 
Figure 3
 
Size adjustments of the Gaussian blob are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Error bars represent ±1 SEM. The dashed line shows the physical size of the blob ( σ = 22.20 arcmin).
Experiment 3: Size distortions with looming
An object in depth flashed alongside a moving object in depth, where depth is generated by stereomotion, is accompanied by an illusory size change of the flashed stimulus. However, under natural conditions stereomotion is accompanied by another motion-in-depth cue—looming—which represents a 2D dilation of image size. This contextual motion is expected to exert a compelling effect on size perception and to contribute to the size distortion reported in Experiment 2, given classic illusions of relative size perception such as the Ebbinghaus illusion (Ebbinghaus, 1885). Experiment 3 addressed this expectation, and we repeated our size measurements with stimuli whose motion in depth was defined exclusively by looming or was defined by both looming and stereomotion cues. The reason for the latter condition stems from the findings of many studies, and computational models, that account for the mutual processing of stereomotion and looming in a variety of behavioral tasks (e.g., Gray et al., 2004; Heuer, 1993; Savelsbergh et al., 1991). Gray and Regan (1998), for example, showed that speed discrimination thresholds are lower for stimuli jointly defined by stereomotion and looming than when these cues are presented individually. Moreover, models that encode and summate information given by stereomotion and looming provide an effective account for human performance (Regan & Beverley, 1979; Regan & Gray, 2000; Rushton & Wann, 1999). It is possible that a stronger size distortion may be present for the combined stereomotion and looming stimulus, and its comparison to the performance in the looming alone condition and in the stereomotion alone condition (obtained in Experiment 2) allows a comparison of the contribution of the two motion-in-depth cues. 
Method
Following the procedure adopted in Experiment 2, we investigated the contribution of looming to the size illusion with a modified stimulus in which looming was simulated by dots moving radially away from or toward the center of the square to produce expanding or contracting motion. Simple geometry was used to determine the dot coordinates by the equation $(x', y') \approx \left(\frac{x\,d}{d + d'}, \frac{y\,d}{d + d'}\right)$, where $d$ is the viewing distance, $d'$ is the simulated depth from disparity, $x$ and $y$ are the coordinates with respect to the center of the square at zero disparity, and $x'$ and $y'$ are the re-scaled coordinates. This equation differs from the equation that derives a looming object's coordinates from its angular subtense, as used in a number of time-to-contact studies (e.g., Regan & Hamstra, 1993). However, within the depth range simulated in the present study (from 36.82 to 45.18 cm in front of the observer), the differences in size-change ratio between the two looming equations are trivial, since they range from 0% to 1.16%. Following the speed in depth and distance in depth adopted in the stereomotion condition, the square underwent expansion with an area ratio of 1.71:1.00 per second, or underwent contraction with an area ratio of 0.48:1.00 per second, within a range of angular subtense of 4.16° to 5.10°. Two bars were also presented to denote zero disparity. In the condition of combined stereomotion and looming, the stimulus incorporated the features of both the looming stimulus and the stereomotion stimulus used in Experiment 2. As well as moving radially away from or toward the center of the square, corresponding dots in each of the two stereo images also moved horizontally in opposite directions at 1.39 deg s⁻¹; i.e., the square appeared to move in depth at 12.54 cm s⁻¹, as in Experiment 2.
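The coordinate re-scaling used to simulate looming can be sketched in a few lines of Python. The function below implements the equation as printed above and nothing more; the function name is hypothetical, and the sign convention (positive simulated depth meaning farther than the screen, negative meaning nearer) is an assumption, since the text does not state it.

```python
VIEWING_DISTANCE_CM = 41.0   # d, viewing distance


def rescale(x, y, depth_cm, d=VIEWING_DISTANCE_CM):
    """Re-scale dot coordinates for a simulated depth d' (the depth_cm argument).

    Implements (x', y') = (x * d / (d + d'), y * d / (d + d')).
    ASSUMED convention: depth_cm > 0 is farther than the screen,
    depth_cm < 0 is nearer.
    """
    scale = d / (d + depth_cm)
    return x * scale, y * scale


if __name__ == "__main__":
    # A dot one unit to the right of center at the near and far depth limits.
    for depth in (-4.18, 0.0, 4.18):
        print(depth, rescale(1.0, 0.0, depth))
```

Coordinates expand when the simulated depth brings the square nearer and contract when it moves farther, and they are unchanged at zero disparity.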
As described above, looming was simulated by changes in the size of a square aperture and by the positions of the dots that defined the square. The size changes were calculated in accordance with cyclopean perspective projection (projection to an imaginary point between the two eyes), and dot size remained constant over time. In real-life looming, the pair of retinal images transforms according to binocular perspective projection and local elements change in size, along with changes in element density and object size. These differences from natural imagery, and the conflict between the cues of dot density and dot size, may compromise the validity of our looming stimulus. To investigate this possibility, we conducted a supplementary looming condition under monocular viewing with an object that does not undergo change in image texture when it looms. Instead of a dot-defined square, a solid square (with similar dimensions) defined by uniform luminance (132 cd m⁻²) was used (see Figure 4). To ensure smooth transformation of the stimulus, an anti-aliasing procedure was employed to define its edges. The dimensions of the square and its expansion and contraction rates were the same in both looming conditions. The display was viewed monocularly while observers wore an eye patch over their non-preferred eye.
Figure 4
 
A sample frame of the looming-solid-square condition from Experiment 3. The motion in depth of a luminance square was specified by looming. Halfway through each motion sequence, a Gaussian blob was flashed in the center of the square. Under monocular viewing, observers were required to match the size of blobs presented in opposite directions of context-motion in depth.
Results
Average adjustments made to the size of the Gaussian blob are separately plotted for approaching and receding motion for the three conditions (Figures 5B–5D). For comparison, the data obtained with stereomotion alone in Experiment 2 (Figure 5A) are also plotted alongside the data obtained in this experiment. The dashed line represents the reference size of the blob (σ = 22.20 arcmin). The pattern of results is similar for all observers (with some individual variation) and a number of observations can be made. First, the data obtained in the condition of combined stereomotion and looming (Figure 5B) and those obtained with stereomotion alone (Figure 5A) are very similar, with observers making similar adjustments to the size of the flashed stimulus. When the square was receding in depth, observers judged the size of the blob to be larger than when the square was approaching. Second, by comparison, when looming was the only cue used to define motion in depth, size effects, where recorded at all, were of much smaller magnitudes (Figures 5C and 5D). Size adjustments, averaged across the five observers, are separately plotted for the four conditions (Figure 6). In the looming dot-defined-square condition, the average adjusted standard deviation of the Gaussian is different from its reference value by 0.07 arcmin (and not significantly different from zero), in contrast to 0.52 arcmin and 0.59 arcmin in the conditions of stereomotion alone and combined stereomotion and looming, respectively. For the looming solid-square condition, the average adjusted standard deviation of the Gaussian differs from its reference value by 0.21 arcmin, which is considerably smaller than the results recorded in the conditions containing stereomotion.
Importantly, this comparison rules out the likelihood that a reduced size effect with the looming dot-defined square is a direct consequence of cue conflict, as a similar pattern of results was noted with a more powerful looming stimulus. In general, the results obtained in this experiment show that looming and stereomotion give rise to different size percepts of a nearby flashed object. 
Figure 5
 
Size adjustments of the Gaussian blob are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Motion in depth was specified by (A) stereomotion, (B) stereomotion and looming, (C) looming produced with a binocularly viewed dot-defined square, or (D) looming produced with a monocularly viewed solid square.
Figure 6
 
Size adjustments of the Gaussian blob, averaged for five observers, are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Motion in depth was specified by stereomotion, stereomotion and looming, looming produced with a binocularly viewed dot-defined square, or looming produced with a monocularly viewed solid square.
General discussion
We have demonstrated that a change in the apparent size of a flashed object, reducing or increasing according to the direction of motion in depth of a surrounding context, accompanies the perception of flash lag in depth. The size effect is substantially larger when stereomotion, rather than looming, is used to define motion in depth. These results may provide some insight into the nature of 3D perceptual space constructed by the human visual system. 
Linares and López-Moliner (2007) reported an absence of the 2D flash-lag effect when observers were required to detect a global pattern composed of flashed and moving dots, and they suggested that a position-judgment task is a necessary condition for the flash-lag effect. Here, we show that the distortion of perceptual space between the flashed object and the moving object is not only reflected in apparent position, but also in apparent size. In Experiment 2 we found that approaching motion induces an apparent decrease in the image size of the flashed object, whereas an apparent increase in its image size is induced by receding motion. This result appears to be the opposite of that predicted by the principle of size constancy implied by Emmert's law (1881, in Boring, 1940). As indicated by published results (Harris et al., 2006; Ishii et al., 2004), and from the result of Experiment 1, the flashed object appears to be more distant from the observer than the moving object with approaching motion but closer to the observer than the moving object with receding motion. By inverse projection, an identical retinal size may correspond to a flashed object of a larger physical size in the context of approaching motion and of a smaller physical size in the context of receding motion, if in the context of approaching motion the flash-lagged object is perceived as further away from the observer and in the context of receding motion the flash-lagged object is perceived as closer to the observer. By this account, which has been used to explain a number of classic size illusions (see Ames, 1952; Gregory, 1966; Rock & Kaufman, 1962), the size of an object is estimated by scaling its retinal size and apparent distance from the observer. If the moving object is only perceived as being further away or closer to the observer, then no size difference of the flashed object would be expected in the context of approaching and receding motion. 
Our results show a strong size difference, but in the opposite direction to what would be expected. 
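For reference, the geometric prediction that these results contradict can be made concrete with a small sketch. The Python snippet below assumes the standard size-distance relation (linear size = 2 × distance × tan(visual angle / 2)). The distance of the square and the perceived depth offset of the flash are hypothetical values chosen only for illustration; they are not taken from the experiments. Only the 22.20 arcmin blob size comes from the stimulus description.

```python
import math


def linear_size(visual_angle_arcmin: float, distance_cm: float) -> float:
    """Linear size (cm) of an object that subtends the given visual angle
    at the given distance, using size = 2 * d * tan(angle / 2)."""
    angle_rad = math.radians(visual_angle_arcmin / 60.0)
    return 2.0 * distance_cm * math.tan(angle_rad / 2.0)


# Retinal size of the flashed blob (sigma of the Gaussian, from the stimulus).
BLOB_ARCMIN = 22.20

# ASSUMPTIONS (hypothetical, for illustration only):
SQUARE_DISTANCE_CM = 100.0  # distance of the moving square at flash time
PERCEIVED_LAG_CM = 5.0      # size of the perceived flash-lag offset in depth

# Approaching context: the flash is perceived BEHIND the square (further away).
approach_size = linear_size(BLOB_ARCMIN, SQUARE_DISTANCE_CM + PERCEIVED_LAG_CM)

# Receding context: the flash is perceived IN FRONT of the square (closer).
recede_size = linear_size(BLOB_ARCMIN, SQUARE_DISTANCE_CM - PERCEIVED_LAG_CM)

print(f"approaching context: {approach_size:.4f} cm")
print(f"receding context:    {recede_size:.4f} cm")
```

Under these assumptions, scaling a fixed retinal size by the perceived distance of the flash predicts a larger apparent size with approaching context motion than with receding context motion, which is the reverse of the pattern found in Experiment 2.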
Our results may be explained in the following way. In the approaching stereomotion conditions, the flashed object, despite being perceived as behind the moving object, takes the moving object as its “background” context (there being no other) and is thus “imaged” against a “background” that is closer to the observer and is induced to appear smaller. In this sense, Emmert's law is obeyed. It is as if the flashed object's actual or apparent stereo-depth value plays no role in size scaling, and thus the object acts merely as an object of fixed retinal size in the context of a closer or more distant reference “background.” In size perception, it behaves much as the familiar retinal afterimage of, say, an electronic flashlight, which appears to be large when in the context of a background that is far away and appears to be small when in the context of a background that is close. No such conflict occurs when the moving stimulus is cued by looming, and there is no, or a minimal, size effect. This explanation implies an interesting conflict of cues, at least for an object that appears briefly. Perhaps more importantly, our results imply considerable independence between the depth cues of looming and stereo disparity, and stereo disparity is thus, perhaps, the less reliable cue, again at least for an object that appears briefly (see Regan & Beverley, 1979; Regan & Gray, 2000; Rushton & Wann, 1999, for discussion of models that combine the two cues in accordance with a simple weighted-sum operation, in which weighting is adjusted according to the reliability of each cue). 
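The kind of weighted-sum cue combination mentioned above can be illustrated with a minimal sketch. It assumes inverse-variance weighting, which is one common way to make weights depend on cue reliability and is not necessarily the exact formulation used in the cited models. The function name `combine_cues` and all numerical values are hypothetical.

```python
def combine_cues(estimates, variances):
    """Combine cue estimates with a weighted sum in which each weight is
    proportional to the cue's reliability (assumed here to be 1 / variance).

    Returns the combined estimate and the normalized weights.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("estimates and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, weights


# Hypothetical distance estimates (cm) from two cues to motion in depth.
looming_estimate, looming_variance = 90.0, 4.0       # treated as more reliable
disparity_estimate, disparity_variance = 100.0, 16.0  # treated as less reliable

combined, weights = combine_cues(
    [looming_estimate, disparity_estimate],
    [looming_variance, disparity_variance],
)
print(f"weights (looming, disparity): {weights}")
print(f"combined estimate: {combined:.2f} cm")
```

In this sketch the less reliable cue still contributes to the combined estimate, but with a smaller weight. Which cue is treated as less reliable here is set purely by the illustrative variances.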
Two recent studies have shown that the veridical positions of the flashed and moving objects are perceived when the moving object is enclosed by the flashed object (Kanai & Verstraten, 2006) or when the spatial relationships between a number of flashed and moving objects are used to define a global shape (Linares & López-Moliner, 2007). These findings suggest that the displaced position perceived in 2D flash lag is represented in higher visual areas, such as the middle temporal cortex (V5/MT), where activity has been shown to correlate with the interaction of motion and position information (McGraw, Walsh, & Barrett, 2004). For the flash-lag-in-depth effect, while the position percept of the flashed object needs to be represented at cortical levels tuned for binocular disparity (see DeAngelis, Ohzawa, & Freeman, 1991; Ohzawa, DeAngelis, & Freeman, 1990), the neural correlate of its size percept remains open to investigation. It is possible that the visual system derives the size estimate of the flashed object from its illusory position in depth, and that the perceived size of the flashed object is represented in the primary visual cortex (V1), which is in turn modulated by feedback from a higher visual area that encodes position in depth. This proposal is supported by the findings of Murray, Boyaci, and Kersten (2006), who reported that the area of activation corresponding to a fixed-size object in V1 changes according to its apparent image size implied by static contextual depth cues. 
The finding from Experiment 3 that contextual stereomotion, compared to contextual looming, induces a much more substantial distortion in apparent image size is counterintuitive, since looming is a systematic change in image size, and contextual size has been suggested to play a crucial role in size perception (Ebbinghaus, 1885). We obtained similar data with a looming solid square, a stimulus that may provide a stronger looming cue by eliminating the conflicting cues of cyclopean perspective projection and constant dot size. These findings, taken together, suggest that the size illusion reported here cannot be explained by size contrast in the image plane, and is largely contingent on the perception of flash lag in depth. The limited effect of looming on size perception associated with flash lag in depth perhaps stems from the fact that the motion-in-depth sensation from looming does not arise at the local level but requires a higher-level interpretation of changing size at the global level (see Beverley & Regan, 1979), and at a given moment in the course of looming there is scant information on position in depth implied by 2D size. By contrast, stereomotion conveys a sensation of motion in depth explicitly as a sequence of in-depth positions specified by disparity and thus is a more useful cue to the computation of relative sizes and positions in 3D space. 
Since Nijhawan (1994) rediscovered the flash-lag effect and interpreted it as a result of the visual system extrapolating the moving object's trajectory to compensate for neural delays, a number of alternative models have been proposed to account for the perceptual lag. Influential proposals include differential latencies involved in the processing of the flashed and moving objects (Baldo & Klein, 1995; Purushothaman, Patel, Bedell, & Ogmen, 1998; Whitney & Murakami, 1998), averaging of the moving object's positions along its trajectory after the presentation of the flashed object (Eagleman & Sejnowski, 2000; Krekelberg & Lappe, 2000), and extra time required to sample the moving object's position in response to the flashed object (Brenner & Smeets, 2000). Despite the differences among these proposals, nearly all of them suggest that, to a certain extent, the flash-lag effect is the manifestation of the biases involved in mapping spatiotemporal changes in the visual field, and explanations in terms of the activities of visuotopically organized neuronal networks have recently been proposed (Baldo & Caticha, 2005; Kanai, Sheth, & Shimojo, 2004). In the present study, the attribute measured, image size, did not undergo physical change over time in the stereomotion condition when illusory change in image size occurred. This finding suggests that the flash-lag effect gives rise to a distortion in perceptual space that cannot be completely explained in terms of temporal changes in visuotopic mapping. 
Acknowledgments
We gratefully acknowledge the perceptive comments of two anonymous reviewers. The work was supported by grant HKU7426/05H to S. K. Khuu and A. Hayes and grant HKU7409/06H to S. K. Khuu from the Research Council of Hong Kong and by an E.T.S. Walton award from Science Foundation Ireland to A. Hayes. 
Commercial relationships: none 
Corresponding author: Terence C. P. Lee. 
Email: terencel@hkbu.edu.hk. 
Address: Psychology Unit, Hong Kong Baptist University, Shek Mun, Shatin, Hong Kong SAR, China. 
References
Ames, A. (1952). The Ames demonstrations in perception. New York: Hafner Publishing.
Baldo, M. V., & Caticha, N. (2005). Computational neurobiology of the flash-lag effect. Vision Research, 45, 2620–2630.
Baldo, M. V., & Klein, S. A. (1995). Extrapolation or attention shift? Nature, 378, 565–566.
Beverley, K. I., & Regan, D. (1979). Separable aftereffects of changing-size and motion-in-depth: Different neural mechanisms? Vision Research, 19, 727–732.
Boring, E. G. (1940). Size constancy and Emmert's law. American Journal of Psychology, 53, 293–295.
Bradshaw, M. F., Parton, A. D., & Glennerster, A. (2000). The task-dependent use of binocular disparity and motion parallax information. Vision Research, 40, 3725–3734.
Brenner, E., & Smeets, J. B. (2000). Motion extrapolation is not responsible for the flash-lag effect. Vision Research, 40, 1645–1648.
DeAngelis, G. C., Ohzawa, I., & Freeman, R. D. (1991). Depth is encoded in the visual cortex by a specialized receptive field structure. Nature, 352, 156–159.
De Valois, R. L., & De Valois, K. K. (1991). Vernier acuity with stationary moving Gabors. Vision Research, 31, 1619–1626.
Eagleman, D. M., & Sejnowski, T. J. (2000). Motion integration and postdiction in visual awareness. Science, 287, 2036–2038.
Ebbinghaus, H. (1885). On memory. Leipzig: Duncker & Humblot.
Freyd, J. J., & Finke, R. A. (1984). Facilitation of length discrimination using real and imaged context frames. American Journal of Psychology, 97, 323–341.
Fröhlich, F. W. (1923). Über die Messung der Empfindungszeit. Zeitschrift für Sinnesphysiologie, 54, 58–78.
Gray, R., Macuga, K., & Regan, D. (2004). Long range interactions between object-motion and self-motion in the perception of movement in depth. Vision Research, 44, 179–195.
Gray, R., & Regan, D. (1998). Accuracy of estimating time to collision using binocular and monocular information. Vision Research, 38, 499–512.
Gregory, R. L. (1966). Eye and brain: The psychology of seeing. New York: McGraw-Hill.
Harris, L. R., Duke, P. A., & Kopinska, A. (2006). Flash lag in depth. Vision Research, 46, 2735–2742.
Hayes, A. (2000). Apparent position governs contour-element binding by the visual system. Proceedings of the Royal Society B: Biological Sciences, 267, 1341–1345.
Heuer, H. (1993). Estimates of time to contact based on changing size and changing target vergence. Perception, 22, 549–563.
Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: Mandatory fusion within, but not between, senses. Science, 298, 1627–1630.
Ishii, M., Seekkuarachchi, H., Tamura, H., & Tang, Z. (2004). 3D flash lag illusion. Vision Research, 44, 1981–1984.
Kanai, R., Sheth, B. R., & Shimojo, S. (2004). Stopping the motion and sleuthing the flash-lag effect: Spatial uncertainty is the key to perceptual mislocalization. Vision Research, 44, 2605–2619.
Kanai, R., & Verstraten, F. A. (2006). Visual transients reveal the veridical position of a moving object. Perception, 35, 453–460.
Krekelberg, B., & Lappe, M. (2000). A model of the perceived relative positions of moving objects based upon a slow averaging process. Vision Research, 40, 201–215.
Linares, D., & López-Moliner, J. (2007). Absence of flash-lag when judging global shape from local positions. Vision Research, 47, 357–362.
Mackay, D. M. (1958). Perceptual stability of a stroboscopically lit visual field containing self-luminous objects. Nature, 181, 507–508.
McGraw, P. V., Walsh, V., & Barrett, B. T. (2004). Motion-sensitive neurones in V5/MT modulate perceived spatial position. Current Biology, 14, 1090–1093.
Murray, S. O., Boyaci, H., & Kersten, D. (2006). The representation of perceived angular size in human primary visual cortex. Nature Neuroscience, 9, 429–434.
Nijhawan, R. (1994). Motion extrapolation in catching. Nature, 370, 256–257.
Ohzawa, I., DeAngelis, G. C., & Freeman, R. D. (1990). Stereoscopic depth discrimination in the visual cortex: Neurons ideally suited as disparity detectors. Science, 249, 1037–1041.
Purushothaman, G., Patel, S. S., Bedell, H. E., & Ogmen, H. (1998). Moving ahead through differential visual latency. Nature, 396, 424.
Regan, D., & Beverley, K. I. (1979). Binocular and monocular stimuli for motion in depth: Changing-disparity and changing-size feed the same motion-in-depth stage. Vision Research, 19, 1331–1342.
Regan, D., & Gray, R. (2000). Visually guided collision avoidance and collision achievement. Trends in Cognitive Sciences, 4, 99–107.
Regan, D., & Hamstra, S. J. (1993). Dissociation of discrimination thresholds for time to contact and for rate of angular expansion. Vision Research, 33, 447–462.
Rock, I., & Kaufman, L. (1962). The moon illusion, II: The moon's apparent size is a function of the presence or absence of terrain. Science, 136, 1023–1031.
Rushton, S. K., & Wann, J. P. (1999). Weighted combination of size and disparity: A computational model for timing a ball catch. Nature Neuroscience, 2, 186–190.
Savelsbergh, G. J., Whiting, H. T., & Bootsma, R. J. (1991). Grasping tau. Journal of Experimental Psychology: Human Perception and Performance, 17, 315–322.
Whitney, D., & Murakami, I. (1998). Latency difference, not spatial extrapolation. Nature Neuroscience, 1, 656–657.
Brenner, E., & van Damme, W. J. (1999). Perceived distance, shape and size. Vision Research, 39, 975–986.
Figure 1
 
(A) The setup used in Experiment 1. A dot-defined square moved toward or away from the observer, with motion in depth specified by stereomotion. Halfway through every motion sequence, a Gaussian blob was flashed in the center of the square. Observers were required to align the position in depth of the blob with the position in depth of the square. (B) An illustration of the setup used in Experiment 2. Observers were required to match the size of blobs presented in the context of opposite directions of motion in depth of the square.
Figure 2
 
Position-in-depth adjustments of the Gaussian blob are plotted as offsets relative to the physically aligned position in depth and as a function of speed in depth of the dot-defined square. Error bars represent ±1 SEM.
Figure 3
 
Size adjustments of the Gaussian blob are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Error bars represent ±1 SEM. The dashed line shows the physical size of the blob (σ = 22.20 arcmin).
Figure 4
 
A sample frame of the looming-solid-square condition from Experiment 3. The motion in depth of a luminance square was specified by looming. Halfway through each motion sequence, a Gaussian blob was flashed in the center of the square. Under monocular viewing, observers were required to match the size of blobs presented in opposite directions of context-motion in depth.
Figure 5
 
Size adjustments of the Gaussian blob are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Motion in depth was specified by (A) stereomotion, (B) stereomotion and looming, (C) looming produced with a binocularly viewed dot-defined square, or (D) looming produced with a monocularly viewed solid square.
Figure 6
 
Size adjustments of the Gaussian blob, averaged for five observers, are plotted as standard deviations of the Gaussian, with data collected for approaching and receding context-motion plotted separately. Motion in depth was specified by stereomotion, stereomotion and looming, looming produced with a binocularly viewed dot-defined square, or looming produced with a monocularly viewed solid square.