**Abstract**
**A number of studies have investigated how the visual system extracts the average feature-value of an ensemble of simultaneously or sequentially delivered stimuli. In this study we model these two processes within the unitary framework of linear systems theory. The specific feature value used in this investigation is size, which we define as the logarithm of a circle's diameter. Within each ensemble, sizes were drawn from a normal distribution. Average size discrimination was measured using ensembles of one and eight circles. These circles were presented simultaneously (display times: 13–427 ms), one at a time, or eight at a time (temporal-frequencies: 1.2–38 Hz). Thresholds for eight-item ensembles were lower than thresholds for one-item ensembles. Thresholds decreased by a factor of 1.3 for a 3,200% increase in display time, and decreased by the same factor for a 3,200% decrease in temporal frequency. Modeling and simulations show that the data are consistent with one readout of three to four items every 210 ms.**

^{2}) or black (0.05 cd/m^{2}) circle outlines (five-pixel width) presented on a gray background (22 cd/m^{2}). Their polarity was changed systematically across trials to avoid confounding stimulus exposure with luminance adaptation. Their contrast was such that they were highly suprathreshold even at the shortest display durations and the highest temporal frequencies. A white central cross was used for fixation. The screen was partitioned into two hemifields by a black, three-pixel-thick vertical line.

*T*: 13.3, 26.7, 53.3, 106.6, 217.3, and 426.5 ms). When repeated, they were refreshed at one of six temporal frequencies (TF: 1.17, 2.3, 4.7, 9.4, 18.8, and 37.5 Hz) with a 0.5 duty cycle. Eight temporal cycles were always presented, so the total duration of a flickering trial depended on TF. The TFs were chosen so that the duration of one half-cycle at the highest frequency equaled the shortest once-per-trial duration and, at the lowest frequency, the longest once-per-trial duration. When presented only once per trial, the number of circles per hemifield (*N*_{s}) was either 1 or 8. When presented repeatedly within one trial, each temporal cycle also displayed 1 or 8 elements. Hence, there were 2 × 2 experimental conditions, hereafter referred to as 1:1, 1:8, 8:1, and 8:8 (i.e., *N*_{s}:*N*_{t}; see Appendix 1 for a list of all notations), where the first and second digits refer respectively to the number of simultaneously displayed circles and to the number of temporal cycles per trial (see Figure 1). In each hemifield all circles' diameters were randomly drawn from one of two lognormal distributions, either lnN[*μ* − Δ*μ*/2, …] or lnN[*μ* + Δ*μ*/2, …].^{1} In this experiment the parameter controlling stimulus variance was fixed at …

*μ* was itself a random variable drawn across trials from a flat distribution, such that *μ* ∈ [1.1°, 2.7°]. The ratio Δ*μ*/*μ* was under the control of two staircases per experimental condition (see Procedure). The locations of the one or eight circles were randomized across both trials and temporal cycles. These locations were constrained such that (a) circles were always within a circular area around fixation with a 13° radius and (b) the outlines of simultaneously presented circles were always at least 1° apart.
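The relation between temporal frequency, half-cycle duration, and total trial duration described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and it assumes that a trial consists of exactly eight full cycles (on plus off phases).

```python
# Values taken from the text: six temporal frequencies, a 0.5 duty cycle,
# and eight temporal cycles per flickering trial.
TEMPORAL_FREQUENCIES_HZ = [1.17, 2.3, 4.7, 9.4, 18.8, 37.5]
DUTY_CYCLE = 0.5
CYCLES_PER_TRIAL = 8


def on_duration_ms(tf_hz, duty_cycle=DUTY_CYCLE):
    """Time (ms) during which the circles are visible within one temporal cycle."""
    return 1000.0 * duty_cycle / tf_hz


def trial_duration_ms(tf_hz, cycles=CYCLES_PER_TRIAL):
    """Total duration (ms) of a flickering trial, assuming `cycles` full cycles."""
    return 1000.0 * cycles / tf_hz


if __name__ == "__main__":
    for tf in TEMPORAL_FREQUENCIES_HZ:
        print(f"{tf:5.2f} Hz: on-phase {on_duration_ms(tf):6.1f} ms, "
              f"trial {trial_duration_ms(tf):7.1f} ms")
```

Under these assumptions the on-phase lasts about 13.3 ms at 37.5 Hz and about 427 ms at 1.17 Hz, matching the shortest and longest once-per-trial durations.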

*μ* were randomized across trials. The participant's task was to decide which of the two hemifields contained the circle(s) with the larger mean size (a two-alternative forced-choice paradigm). Participants indicated their response by pressing one of two keys. There was no feedback. The expected size difference between circles in the left and right hemifields was under the control of two interleaved staircases (accelerated stochastic approximation algorithm; Kesten, 1958) set to converge on a performance of 81% for each experimental condition, so that there were 12 interleaved staircases per experiment and per session. Each staircase typically converged after about 25 trials. Five trials with Δ*μ*/*μ* well beyond the discrimination threshold (or just-noticeable Weber fraction) were randomly interspersed among each staircase's trials to assess the percentage of lapses. Each participant first ran one training session with condition 1:8 (at least 120 trials). The four conditions were repeated four times in a random order, so that each *θ* was computed as the geometric mean of four assessments. The whole experiment was completed in about 3 hr, typically split into two or three sessions per day.
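The staircase procedure and the threshold averaging described above might be sketched as follows. This is an illustration only: it uses the accelerated stochastic approximation update as it is commonly described in the psychophysics literature (step size shrinking with the number of response reversals after the second trial), which the text does not spell out. The class name, starting level, and step constant are hypothetical.

```python
import math

TARGET_PROPORTION_CORRECT = 0.81  # convergence point stated in the text


class AcceleratedStaircase:
    """Sketch of an accelerated stochastic approximation staircase (after Kesten, 1958)."""

    def __init__(self, start, step, target=TARGET_PROPORTION_CORRECT):
        self.level = start          # current Weber fraction (delta-mu / mu)
        self.step = step            # initial step constant (hypothetical value)
        self.target = target
        self.n_trials = 0
        self.n_reversals = 0        # number of changes in response category
        self._last_correct = None

    def update(self, correct):
        """Register one response and return the level for the next trial."""
        self.n_trials += 1
        if self._last_correct is not None and correct != self._last_correct:
            self.n_reversals += 1
        self._last_correct = correct
        if self.n_trials <= 2:
            gain = self.step / self.n_trials
        else:
            gain = self.step / (2 + self.n_reversals)
        response = 1.0 if correct else 0.0
        self.level -= gain * (response - self.target)
        return self.level


def geometric_mean(values):
    """Geometric mean, as used to combine the four threshold assessments."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

A correct response lowers the level by a small amount and an error raises it by a larger amount, so the level settles where the proportion of correct responses equals the target. A real implementation would also need to keep the level positive.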

*θ* × 100%) averaged over all six observers as a function of the display duration for conditions 1:1 (circles) and 8:1 (squares) (all symbols and notations are summarized in Appendix 1). Thresholds drop with duration, but the slopes of the linear regressions (in log-log coordinates; straight lines) are very shallow, congruent with a statistical summation process (see the Modeling section): The slope for condition 1:1 is −0.03, which is not significantly different from 0 (*F* = 2.08, *p* = 0.158); the slope for condition 8:1 is −0.08, which is significantly different from 0 (*F* = 4.67, *p* = 0.037).^{2} It should be pointed out that a threshold drop with duration is theoretically obligatory due to an inevitable reduction of early noise (see also The generalized NIO model section).
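As a rough consistency check, the log-log slope for condition 8:1 can be related to the threshold drop quoted in the Abstract (a factor of 1.3 between the shortest and longest durations). The sketch below assumes a pure power-law relation between threshold and duration; the function names are illustrative.

```python
import math


def loglog_slope(threshold_ratio, duration_ratio):
    """Slope of a straight line in log-log coordinates joining two points."""
    return math.log(threshold_ratio) / math.log(duration_ratio)


def threshold_drop_factor(slope, duration_ratio):
    """Factor by which threshold falls over `duration_ratio` for a given slope."""
    return duration_ratio ** (-slope)


# Shortest and longest display durations from the text (ms).
duration_ratio = 426.5 / 13.3

# A 1.3-fold threshold drop over this range implies a slope near -0.076,
# in line with the reported regression slope of -0.08 for condition 8:1.
implied_slope = loglog_slope(1 / 1.3, duration_ratio)
```

Conversely, under the same assumption, `threshold_drop_factor(-0.03, 32)` gives a drop of only about 1.1 for condition 1:1.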

*θ* × 100]) averaged over the six observers as a function of the display TF for conditions 1:8 (circles) and 8:8 (squares). Here again, sensitivity barely varies with TF. Over the same TF range, these sensitivities show a maximum modulation of 0.23 log units, whereas the standard temporal modulation transfer functions for contrast (Figure 3B) show maximum sensitivity modulations of 2.5 and 2.8 log units for low and high spatial frequencies, respectively (pairs of red closed circles on the left and right smooth curves).

*F*[1, 71] = 0.98, *p* = 0.32).^{3} This is true independently of the rate (TF) at which size information is delivered (interaction: *F*[5, 71] = 0.11, *p* = 0.99).

*θ*× 100) of three observers (different symbols) for condition 8:1 and stimulus durations of 13 and 427 ms (open and solid symbols, respectively) as a function of the parameter controlling stimulus variance. This figure suggests that duration has a large effect on threshold when stimulus variance is low; it has little effect on threshold when stimulus variance is high.

^{−1} is the inverse standard normal distribution (0.81 was the convergence point of the adaptive staircases), … *M* are free parameters. The first two are the variances of the early and late noises, respectively, and *M* is the effective maximum number of circles used by observers to compute the mean of each array of *N*_{s} elements (*M* ≤ *N*_{s}). A random perturbation with variance

*ρ*, then we can reparameterize Equation 1b such that

*M*. In these fits, efficiency and precision were constrained to be nondecreasing with exposure duration. Thus we ensured *M*_{427ms} ≥ *M*_{13ms}, *σ*_{E,427ms} ≤ *σ*_{E,13ms}, and *σ*_{L,427ms} ≤ *σ*_{L,13ms}. Like Solomon et al. (2011), we too found sizeable individual differences in efficiency, with observer SB effectively using 5.9 circles in his calculations and observer HV effectively using just 3.7 (see inset in Figure 5). Notably, however, the present data suggest virtually no effect of exposure duration on efficiency (average *M* = 5). On the other hand, exposure duration does seem to affect all observers' precision (either early noise, late noise, or both).

*m* circles in each hemifield occur within a putative “attentional loop”. Equation 1b can be considered a special case (see below) of this gNIO: … where *N* has a capital subscript, it denotes a variance (in squared units of what Solomon et al., 2011, call “effective size”); when it has a lower-case subscript, it denotes a number (i.e., of either elements or pairs of subarrays).

*N*_{s}, the number of simultaneously visible elements within each of these subarrays; *N*_{t}, the number of successively exposed subarrays on each side of the display; … *m*, the maximum effective sample size of each aforementioned independent estimate. As with Solomon et al. (2011), we consider a circle's effective size to be proportional to its diameter's logarithm. The remaining two symbols in Equation 2 are … *l* × *N*_{t} independent estimates. Each estimate is based on up to 2 × *m* circles (*m* on the left plus *m* on the right). If fewer than 2 × *m* circles appear during the loop, then that loop's estimate is based on the number of circles that did appear. It does not matter whether these elements are present for the whole loop or not (the understanding being that, once they exceed the detection threshold, their sizes are instantaneously coded). Instead, the shorter the loop, the higher the … *best-fitting* … *N*_{C} and *N*_{E} is developed in Appendix 2.

*l*) per subarray would be proportional to the duration of each subarray. Thus the full gNIO has four free parameters: *m*, *σ*_{E}, *σ*_{L}, and *l*_{13}. The first three parameters are defined in the preceding section. The fourth parameter is the number of loops during the shortest stimulus exposure (13.3 ms). The gNIO was fit to the geometric mean thresholds from the three observers (AG, HV, and SB) who participated in both the Main Experiment and the Noise Experiment. (The Main Experiment alone was insufficient to constrain all four parameters.) Best-fitting parameter values were: *m* = 3.2, *σ*_{E} = 0.02, *σ*_{L} = 0.10, and *l*_{13} = 0.0625. The RMS error between the model's predictions and 34 data points (18 from the Main Experiment, omitting 8:8,^{4} plus 16 from the Noise Experiment) was 0.05 log units. These fits and the corresponding data points are shown in Figure 6. Note that 0.0625 loops per 13.3 ms yields a loop duration of 213 ms, i.e., about two loops for the longest presentation duration used here.
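The arithmetic linking the best-fitting *l*_{13} to the loop duration can be made explicit with a short sketch. The constants are taken from the text; the function names are illustrative, and the only assumption is that the number of loops scales linearly with exposure duration.

```python
SHORTEST_EXPOSURE_MS = 13.3          # shortest stimulus exposure
LONGEST_EXPOSURE_MS = 426.5          # longest stimulus exposure
LOOPS_PER_SHORTEST_EXPOSURE = 0.0625 # best-fitting l_13


def loop_duration_ms(exposure_ms, loops_per_exposure):
    """Duration of one loop, given how many loops fit in one exposure."""
    return exposure_ms / loops_per_exposure


def loops_within(duration_ms, loop_ms):
    """Number of loops that fit within a given presentation duration."""
    return duration_ms / loop_ms


def loop_frequency_hz(loop_ms):
    """Loop rate in Hz for a loop duration given in ms."""
    return 1000.0 / loop_ms


loop_ms = loop_duration_ms(SHORTEST_EXPOSURE_MS, LOOPS_PER_SHORTEST_EXPOSURE)
loops_at_longest = loops_within(LONGEST_EXPOSURE_MS, loop_ms)
loop_hz = loop_frequency_hz(loop_ms)
```

With these values the loop lasts roughly 213 ms, about two loops fit in the longest exposure, and the loop rate is a little under 5 Hz.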

*l*_{13} = 0.0625, this is the point at which exactly one loop extends over the eight cycles. Below this frequency the gNIO is able to use more than *m* elements from each hemifield. At 0.068 s per half-cycle (7 Hz), the number of loops during each cycle becomes 1/*m*, and thus the gNIO is able to use all the circles in its computations (see Equation A5 and related text in Appendix 2). This feature of the fits provides a strong constraint on the duration of each loop, such that there is a well-defined minimum in the function mapping parameter values to the RMS error (Figure 7). It is partially supported by the statistical analysis mentioned in …, even though no such significant difference is observed when considering the data of all six observers (see Figure 4 and related analysis). This apparent discrepancy may be accounted for by the fact that the goodness of fit (RMS error) of our model degrades only mildly for loops longer than 200 ms, so that our loop-duration estimate allows for some variability.

*provided* that these weights are applied to the *effective* magnitudes of the attribute under consideration (e.g., luminance, contrast, size, shape, etc.). By effective we mean the physical value *transduced* by the brain (frequently referred to as the psychophysical function, i.e., the function that expresses the relationship between the physical magnitude of a stimulus and the magnitude of the sensory response evoked by that stimulus; Fechner, 1858).

*p* = 0.04). (b) An even shallower slope (−0.03), not significantly different from 0 (*p* = 0.16), was observed for estimating the size of a single item over the same duration range. (c) Increasing the sample size from one to eight items increases size discrimination sensitivity (1/threshold) by an average factor of 1.4, while an ideal observer should have increased it by a factor of … *σ*_{E}, and late, *σ*_{L}) noise and *total* efficiency, *M*. As this model does not include a stimulus duration parameter, *σ*_{E} and *M* refer to the noise and efficiency over the whole inspection period. The fit of the NIO model did yield, as expected, a smaller *σ*_{E} for the longest (427 ms) than for the shortest (13 ms) stimulus duration (*σ*_{E,13ms} = 0.10, *σ*_{E,427ms} = 0.047; averaged over observers), but also a small, unexpected drop in *σ*_{L} over this same time span (from 0.11 to 0.08 when averaged over observers), although these drops were not systematic across observers. The fits did show an effect of exposure duration on efficiency, but only for one of the three observers (see inset in Figure 5).

*m*. During such a loop the early noise *N*_{E} decreases with time but the subsample *m* remains the same, with a new *m*-subsample being drawn with replacement on each new loop. When best-fit to the data, the duration of each loop was 213 ms (i.e., ∼5 Hz), with an effective sample size per loop (*m*) of 3.2 items. The gNIO model (Equation 2) fits best with 3 ≤ *m* ≤ 4, definitely larger than the 1 or 2 suggested by Myczek and Simons's (2008) noiseless simulations. Of particular interest is the inferred 5-Hz loop frequency, which is within the range of attentional sampling inferred by a number of authors from similar (1–8 Hz; Wyart, de Gardelle, Scholl, & Summerfield, 2012) and entirely different experiments (4–10 Hz; e.g., VanRullen, Carlson, & Cavanagh, 2007; Busch & VanRullen, 2010; Macdonald, Cavanagh, & VanRullen, 2014).

*i*), with the total, time-dependent sensitivity given by the sensitivity summation rule,

*Journal of Vision*, 12 (11): 6, 1–12, http://www.journalofvision.org/content/12/11/6, doi:10.1167/12.11.6.

When can we say that subsampling of items is better than statistical summary representations? *Perception & Psychophysics*, 70 (7), 1325–1326, discussion 1335–1336, doi:10.3758/PP.70.7.1325.

*Comptes Rendus de Séances de La Société de Biologie, Paris*, 37 (2), 493–495.

*Psychological Review*, 113 (4), 700–765, doi:10.1037/0033-295X.113.4.700.

*Science*, 340 (6128), 95–98, doi:10.1126/science.1233912.

*Proceedings of the National Academy of Sciences, USA*, 107 (37), 16048–16053, doi:10.1073/pnas.1004801107.

*Vision Research*, 43 (4), 393–404.

*Proceedings of the National Academy of Sciences, USA*, 108 (32), 13341–13346, doi:10.1073/pnas.1104517108.

*Physica*, 18 (11), 935–950.

*On memory: A contribution to experimental psychology*. New York: Teachers College.

*Memoirs of the Leipzig Society*, 6, 457–532.

*Trends in Cognitive Sciences*, 5 (1), 10–16.

*Proceedings of the National Academy of Science, USA*, 110 (15), E1330.

*Journal of the Optical Society of America A*, 3 (11), 52–61.

*Signal detection theory and psychophysics*. New York: Wiley.

*Journal of Vision*, 9 (11): 1, 1–13, http://www.journalofvision.org/content/9/11/1, doi:10.1167/9.11.1.

*Journal of Mathematical Psychology*, 46 (3), 269–299, doi:10.1006/jmps.2001.1388.

*Annals of Mathematical Statistics*, 29 (1), 41–59.

*Journal of Vision*, 13 (8): 1, 1–10, http://www.journalofvision.org/content/13/8/1, doi:10.1167/13.8.1.

*Journal of Neuroscience*, 25 (43), 9907–9912, doi:10.1523/JNEUROSCI.2197-05.2005.

*Attention, Perception & Psychophysics*, 76 (1), 64–72.

*Perception & Psychophysics*, 70 (5), 772–788, doi:10.3758/PP.70.5.772.

*Current Biology*, 23 (11), 981–986, doi:10.1016/j.cub.2013.04.039.

*Spatial Vision*, 10, 437–442.

*Psychological Science*, 24 (8), 1389–1397, doi:10.1177/0956797612473759.

*Journal of Vision*, 11 (12): 18, 1–8, http://www.journalofvision.org/content/11/12/18, doi:10.1167/11.12.18.

*Journal of Vision*, 11 (12): 13, 1–11, http://www.journalofvision.org/content/11/12/13, doi:10.1167/11.12.13.

*American Journal of Psychology*, 78 (3), 392–402.

*Proceedings of the National Academy of Sciences, USA*, 109 (24), 9659–9664, doi:10.1073/pnas.1119569109.

*Psychological Review*, 108 (3), 550–592, doi:10.1037//0033-295X.108.3.550.

*Proceedings of the National Academy of Sciences, USA*, 104 (49), 19204–19209, doi:10.1073/pnas.0707316104.

*Handbook of perception and human performance* (pp. 6–9). New York: Wiley.

Not so fast! *Psychonomic Bulletin & Review*, 18 (3), 484–489, doi:10.3758/s13423-011-0071-3.

*Neuron*, 76 (4), 847–858, doi:10.1016/j.neuron.2012.09.015.

*Nature*, 447 (7148), 1075–1080.

^{1} Weber's Law for diameter (Solomon et al., 2011) allows us to be confident that the visual system effectively perturbs logarithmically transduced circle diameters (or areas, or any arbitrary power function of circle diameters) with independent, identically distributed samples of noise when observers attempt to discriminate sizes. We recognize that equivalent noise models (e.g., the Noisy Inefficient Observer [NIO] and generalized NIO [gNIO]; see the corresponding sections in the paper) are difficult to reconcile with the magnitude estimation (Teghtsoonian, 1965; Chong & Treisman, 2003), because the latter suggest expansive transduction of circle diameters. Therefore, we have decided to reserve further attempts to reconcile magnitude estimation with discriminability for future discussion.

^{4} As described in Appendix 2 (see Equations A11 and A12), Monte Carlo simulations were required to estimate the contribution of stimulus noise to discrimination. We tried all combinations of *m* (up to 8) and *l* (up to 8) for the condition 8:1, in which the aforementioned contribution would be constant whenever there were fewer than one loop per subarray (i.e., *l* < 1). Simulations are much more complicated for condition 8:8, because the aforementioned contribution is no longer constant when, as our data with short displays and high temporal frequencies suggest, there are fewer than one loop per subarray.

*m*: Maximum effective sample size per subarray per loop in the gNIO

*N*_{s}: Number of simultaneously visible circles within each subarray

*N*_{t}: Number of successively exposed subarrays on each side of the display

*ρ*: Correlation between two samples of early noise in the NIO

^{−1}: Inverse standard normal distribution function

*N*_{s}:*N*_{t}.

*l* (the number of times or “loops” an observer forms an independent estimate of the mean size using the same subarray), and *m* (the maximum effective sample size of each such independent estimate). In the present experiments both *N*_{s} and *N*_{t} were either 1 or 8, hence yielding four spatio-temporal combinations 1:1, 8:1, 1:8, and 8:8. These are the last two digits appearing between parentheses in the left side expressions of the equations below. Since *m* is the maximum effective sample size on each side of the display, it cannot exceed the total number of elements that appear on each side during a single loop. Thus, when there is at least one loop per subarray, *l* ≥ 1 ⇒ *m* ≤ *N*_{s}. However, when there is less than one loop per subarray, *m* can be larger than *N*_{s}. In the limit, when all subarrays are exposed within the same loop, *m* is bounded by the total number of elements on each side of the array, *l* ≤ 1/*N*_{t} ⇒ *m* ≤ *N*_{s}*N*_{t}. Furthermore, we adopt the “reasonable” (Allard & Cavanagh, 2012) assumption that all estimates are based on at least one element, i.e., *m* ≥ 1. Consequently, … and

… *l* in Equation A4 because the correlation between successive samples of early noise is 0, but *not* get divided by *l* in Equation A3 because the correlation between successive estimates of the *same sample of* stimulus noise is 1.
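The two limiting cases mentioned here (correlation 0 and correlation 1) follow from the standard expression for the variance of a mean of equally correlated estimates. The sketch below illustrates that general identity; it is not a reconstruction of Equations A3 or A4, the function name is hypothetical, and it assumes an integer number of estimates with a common variance and a common pairwise correlation.

```python
def variance_of_averaged_estimates(variance, n_estimates, correlation):
    """Variance of the mean of n estimates sharing one variance and one pairwise correlation."""
    if n_estimates < 1:
        raise ValueError("n_estimates must be at least 1")
    return variance * (1.0 + (n_estimates - 1) * correlation) / n_estimates


# Uncorrelated estimates (e.g., fresh samples of early noise): variance / n.
uncorrelated = variance_of_averaged_estimates(1.0, 4, 0.0)

# Perfectly correlated estimates (the same sample of stimulus noise
# estimated repeatedly): averaging brings no reduction.
fully_correlated = variance_of_averaged_estimates(1.0, 4, 1.0)
```

Intermediate correlations give intermediate reductions, which is the situation that arises when successive loops draw partially overlapping subsamples.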

*successively displayed* elements (i.e., one at a time), the observer will pick up *m* of the total available elements on each side during each loop. (This number will be zero on half the total number of loops because elements were presented with a duty cycle of 1/2.) When there is at least one loop per exposure (i.e., *l* ≥ 1), the observer will pick up all eight elements in the array. When all eight exposures occur within the same loop, the observer will only get a total of *m* on each side. Thus, in general, we have: … and

… *l* estimates, we have … but the expression for

… *m*/8 < 1) of the same sample is neither 0 nor 1, but something in between, which depends on *l* and *m*.

*l* ≤ 8 and *m* ≤ 8 using a Monte Carlo simulation. The two-parameter exponential … was found to produce an excellent fit (*R*^{2} = 0.983) to these 8 × 8 = 64 values. Consequently, … should be a fairly close approximation, even for noninteger values of *l* and *m*.
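A Monte Carlo estimate of this kind could be set up along the following lines. This is a sketch under stated assumptions, not the authors' simulation: it assumes eight simultaneously visible elements with unit-variance normal effective sizes, that each loop draws *m* distinct elements, that successive loops draw independently from the same eight, and that the final estimate is the average across loops. The function name, trial count, and seed are hypothetical.

```python
import random
import statistics


def stimulus_noise_factor(m, loops, n_items=8, n_trials=20000, seed=0):
    """Variance of the loop-averaged mean-size estimate, relative to one item's variance.

    Assumes unit-variance normal sizes, m distinct items sampled per loop,
    and independent resampling from the same items on every loop.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        sizes = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
        loop_means = [statistics.fmean(rng.sample(sizes, m)) for _ in range(loops)]
        estimates.append(statistics.fmean(loop_means))
    grand_mean = statistics.fmean(estimates)
    return statistics.fmean((e - grand_mean) ** 2 for e in estimates)


if __name__ == "__main__":
    # Tabulate the factor for all integer combinations up to 8 x 8.
    for m in range(1, 9):
        row = [stimulus_noise_factor(m, loops, n_trials=2000) for loops in range(1, 9)]
        print(m, " ".join(f"{value:.3f}" for value in row))
```

Under these assumptions the factor falls from about 1 (one element, one loop) toward 1/8 as either *m* or the number of loops grows, which is the kind of surface a smooth two-parameter function could then approximate for noninteger values.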

*m*of the total available elements on each side during each loop. When there is at least one loop per exposure, the contribution of stimulus noise to the variance of estimated averages will be one-eighth of what it was when only one subarray was exposed, i.e.