**Abstract**
**The fine temporal structure of events influences the spatial grouping and segmentation of visual-scene elements. Although adjacent regions flickering asynchronously at high temporal frequencies appear identical, the visual system signals a boundary between them. These “phantom contours” disappear when the gap between regions exceeds a critical value (g_{max}). We used g_{max} as an index of neuronal receptive-field size to compare with known receptive-field data from along the visual pathway and thus infer the location of the mechanism responsible for fast temporal segmentation. Observers viewed a circular stimulus reversing in luminance contrast at 20 Hz for 500 ms. A gap of constant retinal eccentricity segmented each stimulus quadrant; on each trial, participants identified a target quadrant containing counterphasing inner and outer segments. Through varying the gap width, g_{max} was determined at a range of retinal eccentricities. We found that g_{max} increased from 0.3° to 0.8° for eccentricities from 2° to 12°. These values correspond to receptive-field diameters of neurons in primary visual cortex that have been reported in single-cell and fMRI studies and are consistent with the spatial limitations of motion detection. In a further experiment, we found that modulation sensitivity depended critically on the length of the contour and could be predicted by a simple model of spatial summation in early cortical neurons. The results suggest that temporal segmentation is achieved by neurons at the earliest cortical stages of visual processing, most likely in primary visual cortex.**

*phantom contour*.

*r* > 0.99).

*L*_{0} = 55.0 cd/m^{2}). Stimuli were viewed through a centered circular aperture in the card subtending 40.6°. Observers used a chin rest to maintain a viewing distance of 380 mm from the face of the monitor.

*L*_{0}. The circle was partitioned into 70° temporal (T), superior (S), nasal (N), and inferior (I) quadrants separated by a 20° sector gap of fixed luminance *L*_{0}. Each quadrant was further divided into an inner and outer segment by an annulus of fixed luminance *L*_{0}, the width and radius of which varied between trials. A static circular fixation marker subtending 0.4° and containing light and dark quadrants of 75% contrast was continuously visible throughout each experimental block.

*target segment*) was modulated 180° out of phase with all other segments. The two halves of a raised cosine envelope were applied to the first and last 125 ms of each 500 ms presentation so that contrast was ramped gradually on and off. The luminance of the target segment was thus given by where *L*_{0} is the luminance of the surround, *c* is the Michelson contrast (Michelson, 1927), *f*_{t} is the frequency of the sinusoidal modulation (20 Hz), *t* is the time in seconds, and *φ*(*t*) is the temporal envelope. The luminance of all other segments was given by
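The verbal description above fixes most of the stimulus waveform, so it can be sketched in code. The following is a minimal sketch, not the authors' equations: the sine phase of the carrier, the choice of which segment carries the sign flip, and the names `envelope` and `luminance` are all assumptions made for illustration. The 55.0 cd/m² value for *L*_{0}, the 20 Hz rate, the 500 ms duration, and the 125 ms ramps are taken from the text.

```python
import math

L0 = 55.0        # surround luminance in cd/m^2 (value quoted in the Methods)
F_T = 20.0       # modulation frequency in Hz
DURATION = 0.5   # presentation time in seconds
RAMP = 0.125     # duration of each raised-cosine ramp in seconds


def envelope(t):
    """Raised-cosine temporal envelope phi(t) with 125 ms onset and offset ramps."""
    if t < 0.0 or t > DURATION:
        return 0.0
    if t < RAMP:  # onset ramp, rising from 0 to 1
        return 0.5 * (1.0 - math.cos(math.pi * t / RAMP))
    if t > DURATION - RAMP:  # offset ramp, falling from 1 to 0
        return 0.5 * (1.0 + math.cos(math.pi * (t - (DURATION - RAMP)) / RAMP))
    return 1.0  # plateau between the two ramps


def luminance(t, c, target):
    """Segment luminance at time t for Michelson contrast c.

    The target segment is 180 degrees out of phase with the others,
    which is implemented here as a sign flip (an assumed convention).
    """
    sign = -1.0 if target else 1.0
    carrier = math.sin(2.0 * math.pi * F_T * t)  # assumed sine phase
    return L0 * (1.0 + sign * c * envelope(t) * carrier)
```

With this form the target and non-target luminances always sum to 2*L*_{0}, so the two regions have the same time-averaged luminance and differ only in temporal phase.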

*E* = 2°, 4°, 6°, 8°, 10°, and 12°) in each of the temporal (T), superior (S), nasal (N), and inferior (I) visual-field quadrants, with no gap between inner and outer segments. For each trial, presentation contrast was set using a ZEST adaptive staircase (King-Smith, Grigsby, Vingrys, Benes, & Supowit, 1994; Watson & Pelli, 1983) with a 72% correct threshold criterion. On completion of a staircase, contrast threshold was calculated as the mean of the posterior probability density function (pdf).

*g*_{max}. This was measured at a range of retinal eccentricities while controlling for the variation in sensitivity across the visual field that was assessed in Experiment 1.

*E* = 2° to 12° in increments of 1°), 11 gap widths scaled with eccentricity (*g* = 0° to 0.2*E*° in increments of 0.02*E*°), and four target quadrants (I, S, T, and N).

*E* = 3°, 5°, 7°, 9°, and 11°) were calculated by linear interpolation. In the *low-contrast condition*, contrast was raised by 0.04 log units above threshold to yield expected baseline (*g* = 0°) performance of approximately 80% correct; in the *high-contrast condition*, contrast was raised by 0.10 log units above threshold to yield expected baseline performance of approximately 95% correct. The baseline conditions were included to verify performance levels predicted from the results of Experiment 1; they were omitted for observers OC and PG in the low-contrast condition.

*d*′) was derived from the percentage of correct responses (Hacker & Ratcliff, 1979). At each eccentricity, we used a least-squares algorithm to model sensitivity as a function of gap width using the complementary Gaussian error function where *α* and *β* are the amplitude and standard-deviation parameters, respectively, and erf(*x*) is the Gaussian error function
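The conversion from percentage correct to *d*′ in a four-quadrant identification task can be illustrated numerically. This is a sketch of the standard equal-variance Gaussian model for *m*-alternative forced choice, evaluated by direct integration rather than by the published table; the function names and the integration settings are assumptions for illustration, not the authors' procedure.

```python
import math


def norm_pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def percent_correct(d_prime, m=4, n=4000):
    """P(correct) in m-AFC: integral of pdf(x - d') * cdf(x)**(m - 1) over x."""
    lo, hi = -8.0, d_prime + 8.0
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):  # midpoint rule
        x = lo + (k + 0.5) * h
        total += norm_pdf(x - d_prime) * norm_cdf(x) ** (m - 1)
    return total * h


def d_prime_from_pc(pc, m=4, tol=1e-6):
    """Invert percent_correct by bisection; valid for 1/m < pc < ~1."""
    lo, hi = 0.0, 6.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if percent_correct(mid, m) < pc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

At *d*′ = 0 the function returns chance performance (0.25 for four alternatives), which is why sensitivity rather than raw percentage correct is the more convenient quantity to fit.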

*r*^{2} values above 0.80 in all but one case (mean *r*^{2} = 0.95, *SD* = 0.04). Figure 3 shows, for a single typical observer, sensitivity as a function of gap width at each eccentricity tested. Sensitivity declined with increasing gap width at all eccentricities. The slope of the function was shallower for lower eccentricities and steeper for higher eccentricities.

*g*_{max} as a function of eccentricity. For each eccentricity, *g*_{max} was calculated as the point at which the sensitivity curve fell to one-half of its maximum. The value was chosen because curves were relatively steep at this point; thus it was likely to be less volatile than lower values. As shown in Figure 4, values of *g*_{max} increased approximately monotonically with eccentricity, ranging from about 0.3° to 0.8° for eccentricities from 2° to 12°. Results showed little variation between observers and contrast conditions. The values obtained are consistent, both in absolute terms and in terms of relative variation with eccentricity, with RF diameters of neurons derived from single-cell recordings in V1 of awake, behaving macaques (Dow et al., 1981; Hubel & Wiesel, 1974).
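The half-maximum criterion can be made concrete with a small sketch. The fitted equation is not reproduced in this text, so the parameterization below, *d*′(*g*) = *α* · erfc(*g* / (*β*√2)), is an assumption chosen to match the description of *α* as an amplitude and *β* as a standard deviation; the function names are likewise hypothetical.

```python
import math


def sensitivity(g, alpha, beta):
    """Assumed model: d'(g) = alpha * erfc(g / (beta * sqrt(2)))."""
    return alpha * math.erfc(g / (beta * math.sqrt(2.0)))


def g_max(alpha, beta, tol=1e-9):
    """Gap width at which the curve falls to half of its value at g = 0."""
    half = 0.5 * sensitivity(0.0, alpha, beta)
    lo, hi = 0.0, 10.0 * beta  # the curve is far below half-height at 10 SDs
    while hi - lo > tol:  # bisection on a monotonically decreasing function
        mid = 0.5 * (lo + hi)
        if sensitivity(mid, alpha, beta) > half:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this assumed form the half-height point depends only on the width parameter, at roughly 0.674*β*, so *g*_{max} would be insensitive to the overall sensitivity level *α*.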

*g*_{max} as a function of retinal eccentricity. We pooled *g*_{max} values across all observers and contrast conditions and fitted a regression line using a least-squares procedure. The data were well described (*r*^{2} = 0.56) by the linear function where *E* is retinal eccentricity in degrees. We then assessed, by visual inspection, the correspondence between the *g*_{max} fit line and functions describing RF diameters of neurons at different stages of the visual hierarchy. Dow et al. (1981) provide a function representing variation with eccentricity in the mean RF diameter of macaque V1 cells; we also derived functions from published data to describe the typical diameter of cell RFs in other cortical and subcortical visual areas.

*g*_{max} data and regression line correspond closely to the V1 function. By comparison, they are inconsistent, both in slope and in magnitude, with the next closest functions describing RF diameters for lateral geniculate magnocell (LGN M-cell) centers and V2 cells. Thus our data are consistent with the proposal that *g*_{max} is limited by RF sizes in V1. We also note that the minimum separation between the outer segments of adjacent quadrants in our stimulus was considerably greater than *g*_{max} at all eccentricities tested. This suggests that sensitivity was mediated solely by the boundaries between inner and outer segments and was not influenced by the boundaries between quadrants.

*x*), and one dimension of time (*t*). The spatiotemporal (*x*–*t*) profile comprises a step function in the *x* dimension multiplied by a sinusoid in the *t* dimension. The RF of an ideal detector will have a sensitivity profile that matches the stimulus (Watson, Barlow, & Robson, 1983). As noted by Forte et al. (1999), this ideal *x*–*t* profile is well approximated by linear spatiotemporal separable RFs such as those initially mapped in cat striate cortex (DeAngelis, Ohzawa, & Freeman, 1993a, 1993b, 1995). Such RFs are also present in V1 of macaques: Pack, Conway, Born, and Livingstone (2006, figure 12a), for example, present an RF in which the spatial (*x*) profile reverses in polarity over approximately 25 ms. This interval is ideal for the detection of modulation at 20 Hz, the rate used in the current study, because 25 ms is half of the 50 ms period at that frequency.

*x–t* separable RFs suited to the detection of counterphasing modulation have been mapped in cat and macaque V1. If temporal segmentation is mediated in the first instance by approximately linear and independent units, we can make predictions about the summation of modulation energy across space. As the length of a counterphasing edge is increased from zero, we initially expect spatial summation to proceed in a linear manner as a result of physiological summation (sometimes called *synaptic* or *total summation*) within the RF of individual units. This would be reflected psychophysically by a rapid increase in modulation sensitivity (or, conversely, a rapid decrease in contrast threshold) with increasing contour length. As contour length is increased beyond the spatial extent of single RFs, we expect probability summation across space. That is, the probability of detecting the contour increases with contour length, owing to an increase in the number of independent mechanisms capable of responding. This would be reflected psychophysically by a more gradual increase in contrast sensitivity (or a gradual decrease in contrast threshold) with increasing contour length.
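The two regimes can be summarized by their expected slopes on log-log axes. The relations below are a sketch under simplifying assumptions that the text does not state explicitly: ideal linear summation while the contour lies within a single RF, and Quick pooling over *n* identical, independent units (with *n* proportional to contour length *l*) once it extends beyond one. The symbols *C*_{t} for contrast threshold and *β* for psychometric slope are borrowed from the Appendix.

```latex
\begin{aligned}
\text{linear summation within an RF:}\quad & C_t \propto l^{-1} \\
\text{probability summation across RFs:}\quad & C_t \propto n^{-1/\beta} \propto l^{-1/\beta}
\end{aligned}
```

With *β* near 4.6, as estimated in the Appendix, the second slope would be about −0.22, far shallower than the slope of −1 expected from linear summation. The contour length at which the threshold function bends between the two slopes is what carries information about RF size.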

*E* = 2°, 4°, 6°, 8°, 10°, and 12°) for 12 different lengths of contour scaled by eccentricity (*l* = 0°, 10^{−1.5}*E*°, 10^{−1.35}*E*°, 10^{−1.2}*E*°, 10^{−1.05}*E*°, 10^{−0.9}*E*°, 10^{−0.75}*E*°, 10^{−0.6}*E*°, 10^{−0.45}*E*°, 10^{−0.3}*E*°, 10^{−0.15}*E*°, and *E*°). Each stimulus quadrant comprised a 70° sector (as in Experiments 1 and 2), with contour length manipulated by obscuring the ends of each quadrant border with two wide arcs of luminance *L*_{0} (see Figure 6). This arrangement allowed contour length to vary while keeping the total amount of modulation in the stimulus relatively constant.
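The set of contour lengths is a zero-length condition plus eleven values spaced at 0.15 log units and scaled by eccentricity. A short sketch that generates them follows; the function name `contour_lengths` is hypothetical.

```python
def contour_lengths(eccentricity_deg):
    """Return the 12 contour lengths (in degrees) tested at one eccentricity.

    The first entry is the zero-length condition; the remaining eleven are
    10**e * E for exponents e = -1.5, -1.35, ..., -0.15, 0.
    """
    exponents = [(-150 + 15 * k) / 100.0 for k in range(11)]
    return [0.0] + [eccentricity_deg * 10.0 ** e for e in exponents]


# Example: lengths tested at 6 degrees eccentricity.
lengths_at_6 = contour_lengths(6.0)
```

Because the steps are equal on a logarithmic axis, each nonzero length is about 1.41 times the previous one (10^{0.15}), and the longest contour equals the eccentricity itself.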

*σ*), and the second represents absolute sensitivity of a unit (*A*). The fitting procedure finds the values of *σ* and *A* that best account for the data describing contrast threshold as a function of contour length. Because stimuli were presented with a fixed duration (500 ms), we do not here attempt to estimate the temporal extent of the weighting function. Instead, we assume that each unit integrates across the full presentation time. This assumption, however, only impacts the absolute-sensitivity parameter of the model, not the size parameter that is the focus of the study. The same applies to our choice of stimulus duration: Changing the presentation time could affect estimates of absolute sensitivity, but not of size, for the units in the model.

*l* = 0° was impossible, demonstrating that the width of the masking arcs in the stimulus was sufficiently large to prevent discrimination of phase across the gap. (Because contrast threshold is undefined, these data points have been omitted from Figure 7.) Absolute thresholds decreased with eccentricity, although this may be a result of scaling the contour lengths tested.

*x–t* separable simple cells) might mediate fast temporal segmentation (see the Appendix for details). We fitted the model to the data describing contrast threshold as a function of contour length to derive estimates of RF size for each observer at each eccentricity. As can be seen in Figure 8, RF size estimates derived from the model increase approximately monotonically with eccentricity. Pooling across all observers, the linear function best fitting the data (*r*^{2} = 0.59) is where *E* is the retinal eccentricity in degrees. Visual comparison of our data to functions describing RF diameters of neurons in subcortical and cortical visual areas shows that they are closest to the V1 function at all eccentricities. As in Experiment 2, these data are consistent with the notion that temporal segmentation is performed by neurons in V1.

*g*_{max}), defined as the gap width at which sensitivity fell to half of that observed with no gap between regions, increased monotonically with eccentricity. The values of *g*_{max} corresponded closely to typical receptive-field diameters of macaque V1 neurons reported in the literature. This finding is consistent with the proposal that temporal-phase segmentation is mediated by neurons that detect modulation energy within their receptive field. According to this model, *g*_{max} indicates the separation at which the counterphasing regions fall beyond the spatial range of neurons capable of signaling the modulation.

*g*_{max} reported here accord with the spatial limitations reported by the two groups that have explicitly investigated counterphasing luminance modulation. Although they did not systematically vary the separation between regions, Rogers-Ramachandran and Ramachandran (1998) observed that a gap of 0.75° prevented the appearance of phantom contours. Their stimulus was presented centrally and subtended 13.4° × 15.4°; assuming a neuronal segmentation mechanism, the largest receptive fields responding to the edge would be those nearest the periphery of the stimulus at eccentricities of 6.7° and 7.7°. From the function derived in the present study, the corresponding values of *g*_{max} are 0.66° and 0.70°, respectively. While the separation of 0.75° is beyond *g*_{max}, some residual sensitivity might nevertheless be expected. However, their display comprised modulating fields of spots, to which observers are somewhat less sensitive than to the solid-element stimuli used in our study (Forte et al., 1999).

*g*_{max} of 0.59°, though it is close to the range of *g*_{max} values for individual observers in our study (from 0.46° to 0.71° at 5° eccentricity). For the Forte et al. stimulus, typical RF sizes varied across the length of the contour. As the gap between quadrants approaches a critical width, sensitivity is mediated only by those neurons with the largest RFs, which are situated near the outermost edge of the contour. This decreases the effective length of the contour and thus reduces sensitivity. In comparison, for the stimulus used in the current study, typical RF size is constant across the length of the contour. Thus even as the gap between segments approaches a critical width, neurons with RFs situated along the full length of the contour will continue to mediate sensitivity. This difference could easily account for the small discrepancy between their findings and those of the present study.
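The regression equation for *g*_{max} does not appear in this text, but the predicted values quoted in this section constrain it. The sketch below is an inference from those quoted numbers, not the authors' reported fit: it passes a line through the two predictions given for 6.7° and 7.7° and checks it against the value of 0.59° that appears to correspond to 5° eccentricity. The helper name `line_through` is hypothetical.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two (E, g_max) points."""
    (e1, g1), (e2, g2) = p1, p2
    slope = (g2 - g1) / (e2 - e1)
    intercept = g1 - slope * e1
    return slope, intercept


# Predicted g_max values quoted in the text: 0.66 deg at 6.7 deg, 0.70 deg at 7.7 deg.
slope, intercept = line_through((6.7, 0.66), (7.7, 0.70))

# Check against the third quoted value (0.59 deg, apparently at 5 deg eccentricity).
predicted_at_5 = slope * 5.0 + intercept
```

The quoted values imply a slope of roughly 0.04° per degree of eccentricity and an intercept near 0.39°, which gives about 0.59° at 5°. Because the quoted values are rounded to two decimals, the slope is only loosely constrained (anywhere from about 0.03 to 0.05), so these coefficients should be read as approximate.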

_{max} and local-motion detection *d*_{max}.

*d*_{max}) to exhibit a similar scaling with eccentricity as the spatial limitations of temporal-phase segmentation (*g*_{max}). As local-motion detectors receive inputs from a number of spatiotemporal separable subunits, their combination is likely to yield an aggregate receptive-field size somewhat greater than that of the individual subunits. Accordingly, at any eccentricity, we expect *d*_{max} to exceed *g*_{max} but remain within the range of RF sizes associated with early cortical neurons. Indeed, this appears to be the case. Baker and Braddick (1985) measured critical displacement for short-range motion at retinal eccentricities from 0.4° to 10° and found *d*_{max} to scale linearly from 0.1° to 1.5° over this range. These values are similar to *g*_{max} at lower retinal eccentricities and up to 0.7° greater than *g*_{max} in the periphery. The difference in gradient may be due to the recruitment of more subunits or a greater degree of RF center scatter in the peripheral field compared to the central field.

*Journal of the Optical Society of America A*, 2, 803–812.

*Science*, 286 (5448), 2231.

*Vision Research*, 27 (4), 621–635.

*Journal of the Optical Society of America A*, 8 (8), 1330–1339.

*Vision Research*, 25, 803–812.

*Vision Research*, 14, 519–527.

*Philosophical Transactions of the Royal Society, B: Biological Sciences*, 290, 137–151.

*Spatial Vision*, 10, 433–436.

*Vision Research*, 35 (1), 7–24.

*Trends in Neurosciences*, 18 (10), 451–458.

*Journal of Physiology*, 357, 219–240.

*NeuroImage*, 39 (2), 647–660.

*Proceedings of the Royal Society of London: Series B, Biological Sciences*, 254 (1341), 199–203.

*Nature Neuroscience*, 4, 875–876.

*Vision Research*, 39 (24), 4052–4061.

*Journal of Comparative Neurology*, 201 (4), 519–539.

*Spatial Vision*, 1 (2), 85–102.

*Journal of Vision*, 8 (4): 15, 1–11, http://journalofvision.org/8/4/15, doi:10.1167/8.4.15.

*Visual pattern analyzers*. Oxford: Clarendon Press.

*Vision Research*, 51, 1397–1430.

*Perception and Psychophysics*, 26, 168–170.

*Vision Research*, 18 (4), 369–374.

*Journal of Comparative Neurology*, 158 (3), 295–305.

*Vision Research*, 34 (7), 885–912.

*Science*, 248 (5417), 1165–1168.

*Experientia*, 24, 348–350.

*Journal of the Optical Society of America*, 70, 212–219.

*Studies in optics*. Chicago: University of Chicago Press.

*Optica Acta*, 10, 187–191.

*Acta Ophthalmologica Scandinavica Supplementum*, 6, 1–103.

*Journal of Neuroscience*, 26 (3), 893–907.

*Visual Cognition*, 13 (4), 481–502.

*Spatial Vision*, 10, 437–442.

*Vision Research*, 45 (8), 1075–1084.

*Kybernetik*, 16 (2), 65–67.

*Neural networks for vision and image processing* (pp. 46–91). Cambridge, MA: MIT Press.

*Bulletin of the Psychonomic Society*, 29 (5), 391–394.

*Vision Research*, 21 (3), 409–418.

*Vision Research*, 38 (1), 71–77.

*Psychological Science*, 12 (6), 437–444.

*International Journal of Neuroscience*, 116, 315–320.

*Cerebral Cortex*, 11, 1182–1190.

*Neuropsychologia*, 41, 1422–1429.

*Journal of the Optical Society of America A*, 4, 1612–1619.

*Nature*, 394, 179–182.

*Vision Research*, 42, 2063–2071.

*Nature*, 302 (5907), 419–422.

*Perception & Psychophysics*, 33 (2), 113–120.

*Journal of Applied Mechanics*, 18, 292–297.

*Brain*, 106, 473–502.

*Cerebral Cortex*, 17, 2293–2302.

*x*_{0} is the center and *σ*_{x} is the standard deviation. The spatial profile (depicted in Figure 9) was the first-order derivative with respect to *x* of a two-dimensional Gaussian. The temporal profile was a sinusoid windowed by a Gaussian. The full spatiotemporal profile of the RFs was thus given by where *f*_{t} is the temporal frequency at which the spatial profile reverses in polarity. We assume that for the ideal detector, *f*_{t} is matched to the stimulus.

*l* is the length of the contour in the *y* dimension, *φ*(*t*) is the temporal envelope (Equation 2), and *H*_{a}(*x*) is the Heaviside step function. We assumed that the response at *t*_{0} of an RF centered at (*x*_{0}, *y*_{0}) was given by the correlational integral. This was calculated by multiplying the RF and the stimulus in the frequency domain. The output of each unit was then half-wave rectified and integrated across time,

*i*th unit to the probability *P*_{i} of that unit detecting the stimulus, where *C*_{I} is the stimulus contrast, *A* is the absolute sensitivity of the unit, and *β* is the slope parameter of the psychometric function. Probability summation was performed across space according to the Quick pooling formula (Graham, 1989; Quick, 1974), such that the observer's probability *P* of detecting the stimulus is For a given probability of detection, contrast threshold *C*_{t} is thus

*σ*_{x} and *σ*_{y} in Equation 9 and Equation 10, employed here as a single parameter *σ*, where *σ* = *σ*_{x} = *σ*_{y}) and the absolute sensitivity of an individual unit (*A* in Equations 15, 16, and 17). All units at a given eccentricity had the same values for each parameter. To estimate *β*, we fitted a Weibull function to the psychometric data from each staircase by a least-squares procedure. For each observer and eccentricity, the mean *β* of the 60 (12 contour lengths × 5 staircases) functions was employed in the model (for JH, mean *β* = 4.60, *SD* = 0.16; for LJ, mean *β* = 4.54, *SD* = 0.26; for PG, mean *β* = 4.61, *SD* = 0.10; and for XV, mean *β* = 4.60, *SD* = 0.21).
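The pooling stage can be illustrated with a one-dimensional toy version of the model. This is a deliberately simplified sketch, not the authors' implementation: it ignores the *x* and *t* dimensions, treats each unit's response as the fraction of a Gaussian weighting function (SD *σ*) covered by the contour, places units on a regular grid with an arbitrary `spacing`, and defines threshold at a pooled detection probability of 0.5 under the common Quick form *P* = 1 − 2^(−Σ(*C*·*A*·*r*_i)^*β*). All of these choices and all function names are assumptions.

```python
import math


def unit_response(center, length, sigma):
    """Fraction of a unit's Gaussian weighting covered by a contour.

    The contour spans [-length / 2, +length / 2] along its own axis and the
    unit's weighting function is centered at `center` with SD `sigma`.
    """
    a = (-0.5 * length - center) / (sigma * math.sqrt(2.0))
    b = (0.5 * length - center) / (sigma * math.sqrt(2.0))
    return 0.5 * (math.erf(b) - math.erf(a))


def contrast_threshold(length, sigma, A, beta, spacing):
    """Threshold under Quick pooling: C_t = 1 / (A * (sum_i r_i**beta)**(1/beta))."""
    half_span = 0.5 * length + 4.0 * sigma  # include units just beyond the ends
    n = int(math.ceil(half_span / spacing))
    pooled = sum(unit_response(k * spacing, length, sigma) ** beta
                 for k in range(-n, n + 1))
    if pooled == 0.0:
        return math.inf  # no contour, so threshold is undefined
    return 1.0 / (A * pooled ** (1.0 / beta))


# Example: thresholds for short and long contours with sigma = 0.3 deg.
short_ratio = (contrast_threshold(0.01, 0.3, 1.0, 4.6, 0.3)
               / contrast_threshold(0.02, 0.3, 1.0, 4.6, 0.3))
long_ratio = (contrast_threshold(10.0, 0.3, 1.0, 4.6, 0.3)
              / contrast_threshold(20.0, 0.3, 1.0, 4.6, 0.3))
```

In this toy version, doubling a very short contour roughly halves the threshold (linear summation within a unit), whereas doubling a long contour lowers it only by a factor near 2^(1/*β*) (probability summation across units). The transition between the two regimes shifts with *σ*, which is the property that lets a fit to threshold data constrain RF size.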

*σ* and *A* at each eccentricity that best fit the data describing contrast threshold as a function of contour length. We used a nonlinear least-squares procedure based on the Matlab Optimization Toolbox function *lsqnonlin* and performed each fit 25 times with randomized starting values. The algorithm successfully found reliable parameter estimates for 20 of the 24 functions; the mean 95% confidence interval on the estimate of *σ* was 0.56° with a standard deviation of 0.38°. In four cases, 95% confidence intervals on the estimate of *σ* exceeded 2°; these four data points (JH for *E* = 6° and 12°, LJ for *E* = 12°, and XV for 4°) were excluded from further analyses.

*g*_{max} method, the model was also fitted to the data from Experiment 2 (see Figure 3) according to the same procedure. We calculated the probability of a correct response as where *P* is the probability of detecting the stimulus according to Equation 16. The *β* parameter was fixed at 5.20, which was the mean slope of all Weibull functions fitted to the data from each staircase in Experiment 1. Proportions correct were converted to *d*′, such that the fitting procedure optimized the parameters *σ* and *A* by minimizing the least-squares error of *d*′ values predicted by the model, compared to observed *d*′ values.