Abstract
This study investigates the size tuning of visual spatial attention using steady-state visual evoked potentials (SSVEPs) to understand how visual attention efficiently adapts to and is directed at specific spatial extents. Sixteen participants performed a task involving the rapid serial visual presentation of digits of varying sizes while their brain activity was monitored using electroencephalography. The stimuli flickered at different frequencies, and participants detected target digits at specified sizes. Analysis of SSVEP amplitudes and intertrial phase coherence revealed that visual attention exhibited size tuning, with maximum attentional modulation when the attended size matched the stimulus size. A difference-of-Gaussians (DoG) function effectively modeled the facilitation around the attended size and the inhibition for adjacent sizes. These findings suggest that visual attention can precisely adjust its focus to enhance processing efficiency, in line with the zoom lens hypothesis. Our SSVEP study provides strong neural evidence for the adaptability of visual attention to varying spatial demands.
Throughout a trial, we recorded brain electrical activity from 64 active preamplified Ag/AgCl electrodes mounted on an elastic cap connected to an EEG recording system (ActiveTwo system; Biosemi, Amsterdam, The Netherlands). The electrode arrangement was based on the International 10-20 System. Reference electrodes were placed on both ear lobes (A1, A2). The system amplifier was DC-coupled, and the electrodes were filled with electrolyte gel and mounted within plastic holders on the cap. Eye movements were also recorded using electrodes placed above and below the right eye and at the right and left outer canthi (EOG). The average of all 64 channels was used as a reference in this measurement.
Trials in which signals on the EOG channels exceeded ±100 µV were rejected as containing a saccade or a blink. This threshold was selected based on our practical experience. On average, 6.7% of the data were rejected by this criterion. Additionally, epochs with noticeable artifacts, such as eye blinks and movement artifacts, were removed from the data set.
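The amplitude criterion above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the array layout, the function name `reject_eog_epochs`, and the synthetic data are assumptions introduced here; only the ±100 µV threshold comes from the text.

```python
import numpy as np

def reject_eog_epochs(eog_epochs_uv, threshold_uv=100.0):
    """Return a boolean mask that is True for epochs to keep.

    eog_epochs_uv: array of shape (n_epochs, n_eog_channels, n_samples), in microvolts.
    An epoch is rejected if any EOG sample exceeds +/- threshold_uv.
    """
    eog = np.asarray(eog_epochs_uv, dtype=float)
    peak = np.max(np.abs(eog), axis=(1, 2))  # largest absolute value per epoch
    return peak <= threshold_uv

# Hypothetical data: 5 epochs, 2 EOG channels, 1000 samples of low-amplitude noise
rng = np.random.default_rng(0)
eog = rng.normal(0.0, 10.0, size=(5, 2, 1000))
eog[2, 0, 500] = 250.0  # simulated blink in the third epoch
keep = reject_eog_epochs(eog)
rejected_fraction = 1.0 - keep.mean()
```

The visual inspection step for remaining artifacts described in the text is not covered by this sketch.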
EEG signals contained frequency components corresponding to the luminance flickers on the display, and the amplitudes and phases of these frequency components were analyzed as indices of attentional modulation. We used signals from occipital channels (O1, O2, PO3, PO4, P1, P2, POz, Oz) because our interest is in the effect of visual attention on low-level processing, that is, the processing of luminance flickers. The EEG signals were first filtered using a band-pass filter with cutoff frequencies of 0.5 and 50 Hz to remove frequencies outside the range of interest. They were then segmented using triggers that indicated the start and end of stimulus flickering, which lasted 4 s. The 4-s EEG epochs were analyzed to obtain the amplitudes and phases of the SSVEP frequencies by fast Fourier transform (FFT) after removing the trend from each epoch to eliminate gradual changes in the EEG signal during a trial. We applied a Hanning window before the FFT to minimize the effects of signals responding to the onset and offset of flicker stimulation. After averaging the amplitude and phase over all trials of each combination of frequency and size for each of the four target sizes, the amplitude and phase were normalized to z-scores, isolating the effect of attention from physical stimulus conditions, that is, the effects of the frequency and size of stimulation. The z-scores represent the effect of attended size on the SSVEP data for a particular frequency and a particular stimulus size. This normalization was performed separately for each participant and for each combination of frequency and stimulus size, eliminating influences due to varying sensitivities of individual participants as well as differences in physical stimulation. The z-scores were averaged over the four frequencies for each size to show the influence of the four attention conditions. Normalized amplitudes were averaged over trials with the same digit size from the same attention condition, independently of frequency.
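The per-epoch steps described above (detrending, Hanning window, FFT, reading out one frequency bin) can be sketched as below. The sampling rate, the 12 Hz test frequency, the linear form of the detrending, and the amplitude scaling are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def ssvep_amplitude_phase(epoch, fs, freq):
    """Amplitude and phase of one frequency component in a single-channel epoch."""
    x = np.asarray(epoch, dtype=float)
    n = x.size
    # Remove a least-squares linear trend to suppress slow drifts
    t = np.arange(n)
    slope, intercept = np.polyfit(t, x, 1)
    x = x - (slope * t + intercept)
    # Taper with a Hanning window to reduce onset/offset effects
    w = np.hanning(n)
    spectrum = np.fft.rfft(x * w)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))  # nearest FFT bin
    # Scale by the window sum so a sinusoid of amplitude A yields roughly A
    amplitude = 2.0 * np.abs(spectrum[k]) / np.sum(w)
    phase = np.angle(spectrum[k])
    return amplitude, phase

# Hypothetical 4-s epoch sampled at 512 Hz: a 12 Hz sinusoid plus a slow drift
fs = 512
time = np.arange(0, 4, 1 / fs)
epoch = 3.0 * np.sin(2 * np.pi * 12.0 * time) + 0.5 * time
amp, ph = ssvep_amplitude_phase(epoch, fs, 12.0)  # amplitude comes out close to 3
```

With a 4-s epoch the frequency resolution is 0.25 Hz, so a flicker frequency that is a multiple of 0.25 Hz falls exactly on an FFT bin.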
This was done to obtain the amplitude change for each attended digit size, which was assumed to reflect the attentional window. If the SSVEP responses to a certain stimulus size vary across attention conditions, the change should be due to attentional modulation. That is, the analysis should show how much attention to a certain stimulus size influences the processing of stimuli of other sizes. The change in attentional modulation, depending on the size of the attended stimulus, was obtained for each stimulus size. The four sets of attention functions were then combined as a function of the relative size difference between the stimulus of interest and the attended stimulus. We also analyzed intertrial phase coherences (ITPCs) as an index of the phase coherence of neural responses. ITPCs were calculated from the phase variation of the flicker frequency components across trials. A higher ITPC indicated a greater degree of phase coherence and larger attentional modulation. As for amplitude, we summarized the ITPCs for the four sets of SSVEP results, one for each stimulus size, and combined them into one function of the relative size difference between the stimulus of interest and the attended stimulus.
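One common way to compute ITPC from single-trial phases is the length of the mean unit phase vector. The text does not give the exact formula used, so the sketch below is an assumption, and the function name `itpc` and the example phases are hypothetical.

```python
import numpy as np

def itpc(phases):
    """Intertrial phase coherence: length of the mean unit phase vector (0 to 1)."""
    phases = np.asarray(phases, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Hypothetical single-trial phases (radians) at one flicker frequency
locked = np.full(20, 0.3)                                   # identical phase on every trial
scattered = np.linspace(0.0, 2 * np.pi, 20, endpoint=False)  # phases spread evenly around the circle
itpc_locked = itpc(locked)
itpc_scattered = itpc(scattered)
```

Under this definition, perfectly phase-locked trials give a value of 1 and uniformly scattered phases give a value near 0, independently of the response amplitude.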
Participants performed a target detection task in which they were instructed to detect twin digits among sequentially presented numbers. At the beginning of each block, participants were informed of the specific target size they should attend to, ensuring focused attention on the designated size for accurate task performance. On average, participants made 2.8 detections and 1.2 false alarms (FAs) per trial. The detection rates at each target size were as follows: 82.4% for size 13.8°, 71.9% for size 7.0°, 63.8% for size 3.5°, and 66.9% for size 1.8°. Additionally, FA rates were measured to ensure that responses outside the target locations were minimal. The overall FA rate across all attention conditions was 7.5%, with FA rates of 6.0% for size 13.8°, 5.2% for size 7.0°, 11.6% for size 3.5°, and 7.2% for size 1.8°. FA rates were consistently lower than detection rates, further confirming that participants primarily attended to the instructed sizes. These behavioral results confirm that participants selectively attended to the instructed sizes, providing a robust foundation for interpreting the subsequent SSVEP and ITPC analyses.
We summarized the SSVEP amplitudes and ITPCs derived from the 16 combinations of stimulus size and attended size in terms of attentional modulation in Figure 3, which shows the data averaged over all participants. To elucidate the attentional modulation for each stimulus size, the SSVEP to each stimulus size was plotted as a function of the attended size. The function is assumed to demonstrate the tuning of attention to size. The left-side panels in Figure 3 show the results for SSVEP amplitudes, and the right-side panels show those for SSVEP ITPCs. The four panels for each of amplitude and ITPC indicate the results for the 13.8°, 7.0°, 3.5°, and 1.8° stimuli from top to bottom, as shown by the icons in the middle. The four data points in each panel indicate the data from the different attention conditions for the stimulus size indicated by the icon. Because the data are z-scores, for which the average and SD of the data points in each panel are 0 and 1, respectively, the values cannot be compared among different stimulus sizes (different panels). Normalization was performed to extract the relative differences among the four attention conditions from the EEG for the same stimulus, which shows the shape of the attentional modulation functions.
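The normalization described above can be sketched as a z-score across the four attention conditions for one physical stimulus. The choice of the population SD (`ddof=0`) and the example amplitudes are assumptions; the text only states that the resulting mean and SD are 0 and 1.

```python
import numpy as np

def zscore_across_attention(values):
    """Z-score along the last axis (the four attention conditions)."""
    v = np.asarray(values, dtype=float)
    mean = v.mean(axis=-1, keepdims=True)
    sd = v.std(axis=-1, keepdims=True)  # population SD (ddof=0), an assumption
    return (v - mean) / sd

# Hypothetical mean amplitudes for one stimulus size and frequency
# under the four attended-size conditions
amplitudes = np.array([1.2, 1.0, 0.9, 1.1])
z = zscore_across_attention(amplitudes)
```

Because each set of four values is rescaled separately, the ordering of attention conditions within a panel is preserved while absolute differences between stimulus sizes are discarded, which is why values cannot be compared across panels.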
The function in each panel in Figure 3 shows the attentional modulation for EEG signals tagged to a certain stimulus size by comparing the effect of attended size, that is, how SSVEPs from physically identical stimuli were altered by the attended stimulus size. The peak of the function was found when attention was on the stimulus from which the SSVEP signals were obtained. This was true for all four stimulus sizes and for both amplitude and ITPC, although the ITPC results were less clear than those of amplitude.
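Combining the four functions by the relative size between the attended and measured stimulus can be sketched as averaging a 4 × 4 matrix along its diagonals. Treating the four sizes as equally spaced ordinal steps, the matrix layout, and the example values are assumptions made here for illustration.

```python
import numpy as np

def tuning_by_relative_size(z):
    """Average a (stimulus size x attended size) matrix along its diagonals.

    Returns a dict mapping the offset (attended index minus stimulus index)
    to the mean value over all cells with that offset.
    """
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    return {k: float(np.mean(np.diagonal(z, offset=k))) for k in range(-(n - 1), n)}

# Hypothetical normalized values: 1.0 when the measured size is attended, -0.5 otherwise
z_matrix = np.eye(4) * 1.5 - 0.5
tuning = tuning_by_relative_size(z_matrix)
```

Offset 0 pools the four cells in which the measured stimulus is the attended one, while the largest offsets (±3) each rest on a single cell, so the tails of the combined function are based on fewer conditions than its center.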
We applied a two-way analysis of variance (ANOVA) (4 stimulus sizes × 4 attended sizes) to amplitude and ITPC separately. The amplitude results showed that both the main effect of attended size and the interaction between stimulus size and attended size were statistically significant (F(3, 208) = 41.512, p < 0.001 for attended size; F(9, 208) = 144.82, p < 0.001 for their interaction). These results indicate that the attended stimulus size influences the SSVEP amplitude. Similarly, the ITPC results showed that both the main effect of attended size and the interaction were statistically significant (F(3, 208) = 53.5, p < 0.001 for attended size; F(9, 208) = 19.06, p < 0.001 for their interaction). These results indicate that the attended stimulus size also influences the SSVEP ITPC.
The influence of the attended size within each stimulus size condition was assessed through a series of ANOVAs. In the ITPC data, for the 13.8° stimulus, attended size did not significantly affect ITPC values (F(3, 52) = 1.83, p = 0.153). However, for the 7.0° stimulus, attended size had a significant impact on ITPC values (F(3, 52) = 23.44, p < 0.001). Similarly, a significant effect of attended size on ITPC values was found for the 3.5° stimulus (F(3, 52) = 24.79, p < 0.001) and for the 1.8° stimulus (F(3, 52) = 7.75, p < 0.001).
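Statistics reported with (3, 52) degrees of freedom have the form of a one-way ANOVA comparing four groups with 56 observations in total. The sketch below shows how such an F value is computed, assuming a plain between-groups layout that the text does not spell out; the function name and all data are hypothetical.

```python
import numpy as np

def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way between-groups ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n_total = sum(g.size for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Variability of group means around the grand mean
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return float(f_value), df_between, df_within

# Small hand-checkable example: three groups of three values
f_small, df1_small, df2_small = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])

# Hypothetical data: four attended-size groups of 14 values each,
# chosen only so that the degrees of freedom come out as (3, 52)
rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=1.0, size=14) for m in (0.0, 0.2, 1.0, 0.1)]
f_value, df1, df2 = one_way_anova_f(groups)
```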
In the SSVEP amplitude data, the results showed that attended size significantly affected amplitude values for all stimulus sizes: for the 13.8° stimulus (F(3, 52) = 26.62, p < 0.001), for the 7.0° stimulus (F(3, 52) = 102.98, p < 0.001), for the 3.5° stimulus (F(3, 52) = 109.21, p < 0.001), and for the 1.8° stimulus (F(3, 52) = 87.70, p < 0.001). These analyses indicate that attended size significantly affects SSVEP signals under most conditions. This comprehensive analysis underscores the specific interplay between attended size and stimulus size in modulating both SSVEP amplitude and ITPC, revealing a mechanism of the attentional modulation of size tuning. When participants directed their attention to a specific stimulus size, the SSVEP signals corresponding to that attended size were significantly stronger than those of the other sizes. The strongest signal was observed when the stimulus being measured was attended. It is important to point out that these amplitude results show the inhibitory effect of attention to an adjacent size, that is, the signal was reduced when attention was oriented to the size that was one level smaller or larger. However, the ITPC results did not show a clear inhibitory effect.
The change found in attentional modulation supports the tuning of the size dimension of visual attention: Attentional focus can be adjusted to match stimulus size. This finding aligns with previous studies that have demonstrated the scalability of attentional focus in response to varying spatial demands (e.g., Intriligator & Cavanagh, 2001; Müller & Kleinschmidt, 2003; Chow, Jingling, & Tseng, 2013; Jingling, Tseng, & Zhaoping, 2013). This is in line with the belief that the neural mechanisms underlying visual attention are highly adaptable, allowing for the precise allocation of resources to enhance processing efficiency by attending to the spatial characteristics of the visual stimuli such as size.
We also analyzed ERPs to target presentations (a) to examine whether a conventional EEG measure shows attentional modulation at the attended stimulus size and (b) to estimate the size tuning of attention measured by ERP. We expected larger ERPs to twin digits of the attended target size than to the same twin digits when attention was focused on other sizes. The results indeed showed a clear effect of attention, as shown in Figure 5. The four panels represent the different attention conditions. In all of them, the largest ERP response among the four sizes of twin digits was observed for the attended size. That is, the ERP response to the same stimulus varied depending on which stimulus size was attended. Since the ERP is another index of attention, the results indicate that participants’ attention was controlled as instructed. Figure 6 shows the P3 component of the ERP responses as a function of size relative to the attended size. P3 values are the average of the ERP between 250 and 400 ms after target (or nontarget twin-digit) onset. Although a clear peak appears at the center, as in the case of the SSVEP analysis, there were almost no differences among ERPs for the other locations, where attention was focused on other sizes. This is different from the tuning function found with SSVEP (Figure 5), which is approximated by a DoG function. Although we considered fitting the ERP results with a DoG function, we found that the least squares method did not work because it returned one large value and four similarly small values, without a gradual change of the attentional effect: Any function with a central peak and about the same values at the other locations can explain the results.
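For reference, a DoG tuning function over relative size can be sketched as a narrow positive Gaussian minus a broader negative one. The parameter names, the use of ordinal size steps as the x-axis, and the simplified fit below (linear least squares for the two amplitudes with both widths held fixed) are assumptions introduced here; this is not the fitting procedure used in the study.

```python
import numpy as np

def dog(x, a_p, sigma_p, a_n, sigma_n):
    """Difference of Gaussians: narrow facilitation minus broader inhibition."""
    x = np.asarray(x, dtype=float)
    return a_p * np.exp(-x**2 / (2 * sigma_p**2)) - a_n * np.exp(-x**2 / (2 * sigma_n**2))

def fit_dog_amplitudes(x, y, sigma_p, sigma_n):
    """Least-squares estimate of (a_p, a_n) with both widths held fixed."""
    x = np.asarray(x, dtype=float)
    basis = np.column_stack([
        np.exp(-x**2 / (2 * sigma_p**2)),   # facilitation component
        -np.exp(-x**2 / (2 * sigma_n**2)),  # inhibition component
    ])
    coef, *_ = np.linalg.lstsq(basis, np.asarray(y, dtype=float), rcond=None)
    return coef  # array([a_p, a_n])

# Hypothetical tuning over relative size steps -3..3 (0 = attended size)
steps = np.arange(-3, 4)
tuning = dog(steps, a_p=2.0, sigma_p=0.6, a_n=0.8, sigma_n=1.5)
a_p_hat, a_n_hat = fit_dog_amplitudes(steps, tuning, sigma_p=0.6, sigma_n=1.5)
```

With these hypothetical parameters the function is positive at the attended size and negative at the adjacent steps. When the data contain only a central peak over a flat floor, as described for the ERP results, the widths and the inhibitory amplitude are poorly constrained, which is one way to see why such a fit is uninformative.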
To statistically assess this effect, we conducted a two-way ANOVA (4 stimulus sizes × 4 attended sizes) on the ERP amplitudes. The results revealed that both the main effect of attended size and the interaction between stimulus size and attended size were statistically significant (F(3, 208) = 8.83, p < 0.001 for attended size; F(9, 208) = 23.928, p < 0.001 for their interaction). These findings indicate that attended size modulates ERP amplitudes and that this modulation interacts with the physical size of the stimuli.
To further explore the influence of attended size within each stimulus size condition, we conducted a series of one-way ANOVAs for ERP. Attended size had a significant effect on ERP amplitudes for the 13.8° stimulus (F(3, 52) = 22.14, p < 0.001), the 7.0° stimulus (F(3, 52) = 34.84, p < 0.001), the 3.5° stimulus (F(3, 52) = 15.63, p < 0.001), and the 1.8° stimulus (F(3, 52) = 5.208, p = 0.003). These results confirm that attended size significantly modulates ERP amplitudes for all four stimulus sizes, consistent with the SSVEP amplitude and ITPC results.
This study investigated the size tuning of visual attention with SSVEP measurements. The results provide significant insights into how attention scales according to spatial extent. Our findings indicate that visual attention is not fixed but rather flexible, adapting its size to match the spatial scale of the attended stimuli. This adaptability supports the hypothesis of attention operating akin to a zoom lens, which can change its focus depending on task demands and stimulus characteristics. The observed changes in SSVEP amplitude and ITPC with attended size demonstrate that attentional modulation varies as a function of the attended stimulus size. However, the present results indicate more than a change in the area covered by attention. If attention simply covered a certain spatial extent, attentional facilitation would be expected throughout that area. In that case, attending to the largest size should produce an attentional effect on the smallest stimulus similar to that of attending to the smallest size itself under the conditions of the present experiment. The present results are not consistent with this prediction. The attentional effect on any stimulus size, including the smallest one, showed the largest facilitation when attention was on the stimulus of that size.
Size tuning could be conceptualized as selection of a particular size along the axis of the scale of the visual image to be processed, as Figure 4 shows. The realization of such a size tuning could be related to the trade-off between the size of the attentional focus and the resolution of processing within that focus. As the size of the attended field increases, there is a concomitant decrease in processing resolution, aligning with previous research indicating a trade-off between the extent of attention and its precision (Matsubara et al., 2007; Shioiri et al., 2010). This trade-off is essential for understanding how attention is deployed in real-world scenarios where varying spatial demands require dynamic adjustments in attentional focus. This can be the mechanism underlying attention tuning to a certain size of visual stimulus.
An alternative interpretation of the results of this study is spatial attention to the exact region occupied by the stimulus digits. It might be possible to select an annular area for each stimulus size in concentric stimuli if spatial attention is flexible enough to selectively cover the doughnut-shaped area of each digit, ignoring the annuli inside and outside. Müller and Hübner (2002) reported SSVEP measurements for two RSVP sequences, one of large letters and one of small letters, presented at the same location. SSVEP amplitude was larger for letters of the attended size than for letters of the unattended size for both stimulus sizes. The small letters were not facilitated even though they fell within the area of the attended large letters. These results can be explained by the flexibility of the spatial shape of visual attention, and there is no need to assume size tuning. On the other hand, there are reports of experiments using stimuli of different sizes separated in space, suggesting size tuning instead of spatial selection, as noted in the Introduction. With such a function, the visual system can select the size of the stimulus to process by attention, as suggested in previous studies investigating visual processes with stimuli of different sizes. Experiments with different sizes presented at different locations without spatial overlap are important, and we are working on such an experiment.
The present results did not show differences between SSVEP amplitude and ITPC. The size-tuning results were similar with enhancement and inhibition around the size of the attention focus, and the relative effect of the two factors (An/Ap) did not exhibit a statistically significant difference. This is consistent with previous research with SSVEP. Similar spatial tuning around the focus of attention was reported for amplitude and ITPC data (Shioiri et al., 2016), whereas there may be a difference in the temporal aspect (Kashiwase et al., 2012). The similarity between the SSVEP amplitude and ITPC suggests that the increase in neural outputs can at least partially be explained by synchronization among different neural units, although the previously reported difference in the temporal aspect between the two indices (Kashiwase et al., 2012) suggests that the increase in neural outputs may not be solely explained by synchronization among different neural units, as supported by a meta-analysis of SSVEP data (Adamian & Andersen, 2024).
Notably, the ERP results revealed a sharper and more selective size-tuning profile compared to the broader tuning observed in the SSVEP amplitudes. This finding highlights the distinction between early, broad attentional modulation reflected in SSVEP and the more selective attentional processing at later stages of visual processing, as captured by ERP amplitudes. These observations align with findings from previous work by Shioiri et al. (2016), which demonstrated that SSVEP reflects a broad spread of attention, whereas ERP components, such as the P3, reflect a narrower selection mechanism involving inhibition of nontarget information. The results suggest that the tuning between ERP and SSVEP differs, although we have no statistical test available to compare the shape of the tuning functions.
A potential concern in our study is whether the observed SSVEP responses might be confounded by decision- and motor-related neural processes, as all trials contained targets. Several observations converge to suggest that this is unlikely and that our findings reflect attentional modulation along the size dimension. First, the SSVEP is a frequency-tagged neural response directly tied to the flicker frequency of visual stimuli, and each stimulus size in our study was uniquely assigned to a specific flicker frequency. This frequency specificity ensures that the signals we measured are primarily driven by sensory input from the tagged stimuli, with minimal influence from nonsensory neural processes, such as decision- or motor-related activities, which are not frequency specific. Second, our primary interest lies in the modulation of SSVEP amplitude and phase across attention conditions rather than absolute neural activity levels. Decision- and motor-related processes, as well as stimulus size or location, are likely to influence all conditions uniformly, acting as additive factors that do not obscure the size-specific tuning patterns observed in our results. Third, complementary ERP analyses support the robustness of our findings. The ERP results revealed distinct attentional modulation patterns across different size conditions, with the largest responses consistently observed when stimuli were at the attended size. This demonstrates that ERP responses varied depending on attention condition, showing a clear focus of attention on specific sizes, consistent with the SSVEP findings despite differences in tuning shape. Although the ERP results could be attributed to decision/motor-related responses, the shape of the ERP size tuning was very different from that of the SSVEP: a peak at the attended size with neither inhibition nor a gradual reduction as a function of the distance from the peak.
Taken together, these findings confirm that the size tuning observed in our study reflects attentional modulation, independent of decision- or motor-related influences, providing strong evidence for size-specific attentional tuning as an intrinsic feature of visuospatial attention.
The current study applied a DoG model to characterize attentional modulation across relative sizes of stimuli. While the DoG model effectively captures the facilitation and inhibition processes observed in our data, its relationship with other existing models, particularly the normalization model of attention (Reynolds & Heeger, 2009), warrants further consideration. Below, we discuss the distinctions between our findings and the predictions of the normalization model, as well as the broader implications for understanding attentional mechanisms.
Reynolds and Heeger’s normalization model provides a robust framework for explaining attentional modulation in scenarios involving competition between attended and unattended stimuli. Specifically, the model posits that neural responses are reduced or normalized when the ratio between the size of attention and the size of the stimulus increases (Herrmann, Montaser-Kouhsari, Carrasco, & Heeger, 2010; Itthipuripat, Garcia, Rungratsameetaweemana, Sprague, & Serences, 2014; Zhang, Japee, Safiullah, Mlynaryk, & Ungerleider, 2016). However, applying this model to our results is not straightforward due to critical differences in the experimental conditions and the attentional dimensions being studied.
First, the attentional modulation observed in our study pertains to changes in size along a single attended dimension, rather than differences in spatial location or stimulus size. While the peak of attentional modulation in our data aligns with the general facilitation predicted by the normalization model, this correspondence assumes equivalence between the size of attention and the physical size of stimuli, which is not the case in our experimental design. Our study uniquely examines the allocation of attention across relative sizes, a dimension not explicitly addressed by the normalization model.
Second, a key distinction lies in the treatment of neighboring sizes. In our study, attentional tuning reflects not only facilitation at the target size but also inhibition of adjacent sizes. This interplay of facilitation and inhibition is a defining feature of the DoG model but is not captured by the normalization model, which lacks explicit mechanisms for addressing attentional modulation across neighboring sizes. If inhibition from adjacent sizes is considered, the attention tuning observed in our results can be interpreted as a combination of facilitation at the size of interest and suppression from both larger and smaller sizes. This interpretation highlights an important limitation of the normalization model when applied to size-tuning processes.
Third, the attentional dimensions studied here differ fundamentally from those addressed by the normalization model. While the normalization model focuses on scenarios of spatial competition between attended and unattended stimuli, our study investigates attentional tuning within a single attended dimension. Specifically, we examine the effect of stimulus size on attention allocation without introducing a condition equivalent to unattended stimuli. This focus on relative sizes rather than attended versus unattended stimuli underscores the distinct nature of our findings.
Finally, our results reflect attentional modulation across a broader distribution of neural responses rather than changes in single-cell activity. The normalization model was designed to explain neural responses at the level of specific cortical mechanisms, such as individual neurons or cortical columns, with varying stimulus sizes. In contrast, our study examines the distribution of attentional effects across sizes, making a direct application of the normalization model to our findings inappropriate.
Recent studies have demonstrated that the size of visuospatial attention can adaptively change depending on cue uncertainty and task demands, consistent with the zoom lens theory (Feldmann-Wüstefeld & Awh, 2020; Feldmann-Wüstefeld, Weinberger, & Awh, 2021; Sookprao et al., 2024). These findings are primarily based on reconstruction-based models of slow alpha band activity, which reflect changes in the fidelity of attentional representation. In our study, we investigated attentional modulation in the size dimension using SSVEP and ERPs, focusing on how attention facilitates or suppresses responses across stimulus sizes. While alpha band activity captures top-down control mechanisms and dynamic flexibility in spatial attention, our results represent a different attentional process: size-specific tuning within the context of sustained attention.
The studies by Feldmann-Wüstefeld and colleagues primarily address changes in the size of spatial attention based on external factors such as cue uncertainty and task demands, aligning with the zoom lens model's predictions. This approach reflects a mechanism by which attentional allocation dynamically adjusts to optimize task performance, emphasizing spatial flexibility. In contrast, our study focused on attentional tuning along the size dimension, revealing that attention is facilitated for stimuli of a specific size while being inhibited for adjacent sizes. This reflects a distinct attentional process, where the change of attention effect along the size dimension is not directly linked to changes in the overall spatial extent of attention.
Despite these differences, an important commonality emerges: Both studies underscore the adaptability of attentional mechanisms. Whether through spatial flexibility in response to task demands or selective modulation along the size dimension, these findings highlight the nuanced ways in which attention can adapt to optimize sensory processing. The discrepancies between the results can be attributed to the different attentional dimensions under investigation, namely, dynamic changes in spatial allocation versus size-specific attentional tuning. This interpretation is consistent with Shioiri et al. (2016), who demonstrated distinct spatial properties for attentional spread (reflected in SSVEP) and selection/inhibition (reflected in ERP components). Together, these findings reinforce the importance of considering multiple attentional dimensions when examining the adaptability of attention.
A notable aspect of our study is its focus on the tuning of attention across relative sizes of stimuli, which contrasts with some previous studies that primarily investigated the differences between attended and unattended stimuli in peripheral locations. For example, Phangwiwat et al. (2024) demonstrated that attentional modulations of SSVEPs depend on eccentric locations, where the size of attention and stimulus was often manipulated within the same peripheral areas. In contrast, our study focused on stimuli presented at the fovea with sizes expanding to the periphery, offering insights into a distinct attentional dimension.
Our primary interest lies in understanding how attention is distributed across different relative sizes of stimuli, presented in a concentric arrangement. While the center location of the stimuli remained fixed, the doughnut-shaped design ensured that different sizes stimulated distinct retinal areas. This design allowed us to isolate attentional modulation effects by comparing SSVEP and ERP responses to the same physical stimulation under different attentional conditions. Importantly, our findings demonstrate that attentional modulation patterns observed in SSVEP and ERP responses are robust against the potential confound of retinal stimulation area differences.
Despite this, it is worth acknowledging that the use of concentric stimuli raises questions about the generalizability of our findings. To address this, a follow-up experiment is ongoing, in which targets of different sizes are presented at peripheral locations equidistant from the fovea. This experimental design aims to minimize the influence of retinal stimulation area and to clarify whether the observed inhibitory effects are specific to concentric or doughnut-shaped stimuli or represent a more general attentional effect.