Open Access
Article  |   August 2024
The visual statistical learning overcomes scene dissimilarity through an independent clustering process
Author Affiliations
  • Xiaoyu Chen
    Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
    [email protected]
  • Jie Wang
    Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
    [email protected]
  • Qiang Liu
    Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu, China
    [email protected]
Journal of Vision August 2024, Vol.24, 5. doi:https://doi.org/10.1167/jov.24.8.5
Abstract

Contextual cueing is a phenomenon of visual statistical learning observed in visual search tasks. Previous research has found that the degree of deviation of items from their grid centers, known as variability, determines the extent of generalization for a repeated scene. Introducing variability substantially increases the dissimilarity between multiple occurrences of the same repeated layout. However, current theories do not explain the mechanisms that overcome this dissimilarity during contextual cue learning. We propose that the cognitive system first abstracts specific scenes into scene layouts through an automatic clustering process unrelated to specific repeated scenes, and subsequently uses these abstracted layouts for contextual cue learning. Experiment 1 showed that introducing greater variability in search scenes hinders contextual cue learning. Experiment 2 further established that performing extensive visual searches with spatial variability in entirely novel scenes facilitates subsequent contextual cue learning under the corresponding scene variability, confirming that clustering knowledge is learned before contextual cue learning and independently of specific repeated scenes. Overall, this study demonstrates the existence of multiple levels of learning in visual statistical learning: item-level learning can serve as material for layout-level learning, and generalization reflects the constraining role of item-level knowledge on layout-level knowledge.

Introduction
When humans search for specific targets in familiar visual scenes, their efficiency improves. Chun and Jiang (1998) conducted a laboratory simulation of this phenomenon. In their experiments, familiar visual scenes were defined operationally as visual search trials displaying repeated arrays, whereas unfamiliar visual scenes were defined as randomly generated arrays across trials. Each of these search scene conditions accounted for 50% of the total trials. The results indicated that, after engaging in visual search for a period, participants exhibited shorter response times for repeated scenes compared with novel scenes. This finding suggests that participants became more familiar with the repeated scenes during the search task, consequently enhancing their search efficiency through these memory representations. This phenomenon, known as the “contextual cueing effect,” established the contextual cueing paradigm. Furthermore, as participants formed and used memory representations of contextual cues without explicit learning intent, conscious awareness of the learning, or the ability to recognize or verbally describe the acquired repeated scenes, contextual cue learning was considered implicit (Goujon, Didierjean, & Thorpe, 2015; Jiang & Sisk, 2020; Reber, Batterink, Thompson, & Reuveni, 2019). Subsequent extensive research has confirmed the stability of the contextual cueing effect as a phenomenon (Jiang & Sisk, 2020; Merrill, Conners, Yang, & Weathington, 2014; Ogawa, Takeda, & Kumada, 2007; Schankin & Schubö, 2009). 
In the classical contextual cueing paradigm, the search scene is typically divided into a set of square grids, and stimuli are presented within some of these grids. Additionally, to avoid collinearity issues, each item in the scene is shifted slightly away from the center of its grid. These positional offsets remain consistent throughout the entire experiment, resulting in identical presentations of repeated scenes every time they appear. One study broke away from this conventional approach (Higuchi, Ueda, Shibata, & Saiki, 2020). In that study, positional offsets were generated randomly before the start of each trial, introducing distortion each time the same repeated scene layout reoccurred. This means that each trial involved jittering the items in the scene by a random magnitude away from the center of the reference grid. Higuchi et al. (2020) defined the maximum deviation of items from the grid center as "spatial variability." They discovered that participants exhibited the contextual cueing effect during the learning stage when spatial variability was introduced. However, when the spatial variability increased during the testing stage, the contextual cueing effect vanished. These findings suggested that the generalization of contextual cues is limited to the learned spatial variability. 
Higuchi et al. (2020) sought to explain this phenomenon within the framework of generalization. They proposed that participants acquired an experience suggesting that repeated scenes might undergo certain distortions (Medin, Goldstone, & Gentner, 1993; Posner & Keele, 1968; Shepard, 1958a, 1958b). According to their explanation, this experience enables the cognitive system to retrieve representations of repeated scenes accurately, regardless of these distortions. However, this explanation rests on the assumption that participants learn specific repeated scenes, which is practically implausible. We define the set of grid positions where items appear in a scene as the scene layout, and the maximum deviation of items from the grid center as spatial variability. Even if a scene layout repeats, under high spatial variability the likelihood of two nearly identical specific scenes appearing within a short period was very low in their study. Past research has shown that perceptual primitives are combined only when simultaneously held within a spatial–attentional window limited by working memory (Jiang & Leung, 2005; Perruchet & Vinter, 1998). Furthermore, Thomas, Didierjean, Maquestiaux, and Goujon (2018) found that scene representations can be modified or reinforced within a short time window. This evidence indicates that, during contextual cue learning, the cognitive system can strengthen fragile representations into long-term memory representations only when the same specific repeated scene appears in close temporal proximity. In scenes exhibiting spatial variability, particularly under high variability, the likelihood of an identical scene reoccurring in close succession is notably low. Consequently, these so-called repeated scenes might be identified as novel, making effective learning improbable. 
Given this implausibility, new hypotheses must be proposed and validated to improve our understanding of this phenomenon. 
To solve these problems, the most crucial requirement for a new interpretation is to avoid assuming that participants have learned specific scenes. To circumvent this assumption, we proposed an alternative explanation built on a new concept: clustering. In computer science, clustering is an unsupervised exploratory method of statistical data analysis designed to group examples based on their similarities. By computing the distances between examples, the algorithm divides them into several independent groups (clusters) such that distances are small within groups and large between groups. Once the clustering patterns are learned, the algorithm can classify new examples into the known groups (Milligan & Cooper, 1987; Rokach & Maimon, 2005). In our study, clusters are defined by the maximum range of deviation of items from the center of the grid, referred to as variability, and by the position of the grid within the scene. We suggest that, at the beginning of the visual search task, the cognitive system samples search scenes and endeavors to categorize all items into spatially isolated clusters. We term the result clustering knowledge. After clustering knowledge is acquired, each item in a new scene is classified into its respective cluster, so that specific scenes are abstracted into layouts (Figure 1). When scenes are abstracted as layouts, the cognitive system easily recognizes repeated occurrences of certain scene layouts, and contextual cue learning begins. 
Figure 1.
 
Schematic of clustering learning. At the beginning of the visual search task, the cognitive system samples search scenes and categorizes items into spatially isolated clusters (top). After clustering knowledge is acquired, each item in a new scene is classified into its respective cluster, so that specific scenes are abstracted into layouts (bottom). The potential range of appearances of items under different variability conditions is depicted in the right panel by light gray shading.
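As a rough computational analogy (not the authors' model), the clustering step described above can be sketched with a plain k-means procedure: jittered item positions are grouped into spatially isolated clusters, and a new item is then abstracted to its nearest cluster, turning a specific scene into a layout of cluster labels. The grid centers, jitter range, and cluster count below are illustrative assumptions.

```python
import math
import random

random.seed(1)

# Illustrative setup: items jitter uniformly around three well-separated
# grid centers -- the "clusters" the cognitive system must discover.
centers = [(2.0, 2.0), (6.0, 2.0), (4.0, 6.0)]
items = [(cx + random.uniform(-0.9, 0.9), cy + random.uniform(-0.9, 0.9))
         for cx, cy in centers for _ in range(20)]

def farthest_first(points, k):
    """Deterministic seeding: start anywhere, then repeatedly add the
    point farthest from all chosen seeds (one seed per distant cluster)."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points,
                         key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds

def kmeans(points, k, iters=30):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    means = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, means[j]))
            groups[nearest].append(p)
        new_means = []
        for i, g in enumerate(groups):
            if g:
                new_means.append((sum(x for x, _ in g) / len(g),
                                  sum(y for _, y in g) / len(g)))
            else:
                new_means.append(means[i])  # keep a seed with no members
        means = new_means
    return means

def abstract(item, means):
    """Map a jittered item to its cluster label: the specific scene
    becomes an abstract layout of cluster indices."""
    return min(range(len(means)), key=lambda j: math.dist(item, means[j]))

learned = kmeans(items, k=3)
```

Under this analogy, increasing variability widens the clusters, so more samples are needed before the centroids stabilize, and an item jittered beyond the learned range may fail to map onto any familiar cluster.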
This explanation interprets the constrained generalization found by Higuchi et al. (2020) as the result of exceeding the limit of the learned clustering knowledge when variability increases during the testing phase. The increased variability makes it impossible to classify items in the scene into a specific cluster, so an abstract layout cannot be formed, causing retrieval failure and the disappearance of the contextual cueing effect. Moreover, although Higuchi et al. (2020) highlighted spatial variability as an overall distortion within repeated scenes, spatial variability existed in both repeated and novel layouts in their study. Consequently, they could not eliminate the possibility that spatial variability and scene layout are acquired independently. 
The core claim that this study seeks to confirm is that, when item positions within a scene are spatially variable, the cognitive system first learns clustering knowledge and then uses the results of clustering for contextual cue learning. To demonstrate this explanation, two inferences must be verified: 1) the learning of clustering knowledge precedes contextual cue learning, and 2) the learning of clustering knowledge is independent of learning the repeated scenes. 
The aim of Experiment 1 was to validate the first inference: that the learning of clustering knowledge precedes contextual cue learning. Because greater spatial variability demands noticeably more samples to establish clustering knowledge, our first assumption implies that greater spatial variability significantly delays contextual cue learning compared with the lesser variability condition. This notion finds support in the studies by Chun and Jiang (1998) and Higuchi et al. (2020), in which greater spatial variability seemed to require more repetitions to produce the same contextual cueing effect than conditions with lesser spatial variability. Although these findings indicate a numerical trend in which spatial variability influences the speed of contextual cue learning, a thorough exploration of this trend was lacking. Experiment 1 therefore tested whether larger spatial variability delays the appearance of contextual cues. We predicted that participants would demonstrate the contextual cueing effect later in the high spatial variability condition than in the low spatial variability condition, as indicated by notable differences between the two conditions in the early stages of learning. 
Experiment 2 aimed to validate the second hypothesis: that participants can acquire clustering knowledge through searching in novel layouts and automatically apply this knowledge to newly presented scenes. If this hypothesis holds, participants should demonstrate a contextual cueing effect more rapidly when they have previously acquired clustering knowledge through novel layouts. To this end, Experiment 2 comprised two consecutive learning stages: the first involved extensive search trials of completely novel layout scenes with spatial variability, and the second involved contextual cue learning with high spatial variability. We anticipated that participants would exhibit the contextual cueing effect faster in the second stage if they had first experienced the corresponding spatial variability. Such results would demonstrate that clustering knowledge can be acquired autonomously, independent of specific repeated scenes, and that the cognitive system can apply this knowledge automatically to abstract scenes. 
Experiment 1: Greater spatial variability hinders contextual cue learning
Experiment 1 aimed to investigate whether greater spatial variability leads to a delay in contextual cue learning, seeking evidence for the hypothesis that the acquisition of clustering knowledge precedes contextual cue learning. To achieve this goal, we used the classical contextual cueing paradigm, which included novel and repeated layouts, to determine the time required for the acquisition of contextual cues. Additionally, we introduced spatial variability in item locations. Two experimental groups performed scene searches: one group was exposed to high spatial variability, in which items could deviate substantially from the center of their respective grid; the other was exposed to low spatial variability, in which items deviated less from their grid center (see Figure 2). Our prediction was that, under high spatial variability, the onset of contextual cue learning would occur later, resulting in a delayed appearance of the contextual cueing effect compared with the low variability condition. 
Figure 2.
 
Diagram of visual search scenes. The search space is uniformly divided into an 8 × 6 checkerboard grid. Each search scene comprises 12 distractors and 1 target randomly distributed across the grid. The target is a rotated “T,” and the distractors are rotated “L's,” all composed of the same line segments. The potential range of appearances of items under different variability conditions is depicted in the right panel by light gray shading. Under high variability conditions, items randomly appear within a larger range across grid squares, whereas under low variability conditions, items appear within a smaller range within grid squares.
Methods
Participants
Forty-six healthy participants were recruited for this experiment, all with normal or corrected-to-normal vision, no history of mental illness, and no participation in any contextual cueing paradigm experiment in the past month. Written informed consent was obtained from all participants. Participants were randomly assigned to two equal groups, one completing visual searches under low spatial variability and the other under high spatial variability. The low spatial variability group comprised 3 males and 20 females, with an average age of 21.7 ± 1.7 years; the high spatial variability group comprised 4 males and 19 females, with an average age of 23.0 ± 2.5 years. The expected statistical power was calculated using G*Power 3.1 software, assuming an α level of 0.05, a medium effect size of f = 0.25, and a correlation of 0.5 among repeated measures, yielding a statistical power (1 − β) of 0.99, indicating high statistical power. 
Apparatus
The experiment employed a 19-inch LED monitor with a resolution of 1,024 × 768 and a refresh rate of 60 Hz. Stimulus presentation and response recording were conducted using Matlab software (The MathWorks, Inc., Natick, MA) equipped with Psychtoolbox (Brainard, 1997). The laboratory lighting was comfortable, and participants used a standard computer keyboard for responses. During the experiment, participants' eyes were positioned 55 cm away from the screen. 
Experimental stimulus
The background color displayed on the screen was gray (RGB = 120, 120, 120). The spatial scope of the search scene measured 16.0° × 12.0° in visual angle, with its center coinciding with the screen center. The search scene was uniformly divided into an 8 × 6 grid, totaling 48 grids, with 12 grids in each of the four quadrants of the vertical coordinate system centered at the screen's midpoint. Each grid measured 2.0° × 2.0° and stimuli were presented around the center points of these grids. Each visual search scene consisted of 13 stimulus items, comprising 12 distractors and 1 target. All stimuli were composed of two identical black lines (RGB = 0, 0, 0) and measured 0.8° × 0.8° (Figure 2). The distractors were the letter “L” rotated randomly at angles of 0°, 90°, 180°, or 270°. The target was the letter “T” rotated either clockwise or counterclockwise by 90°. 
Before the experiment, 16 target positions were randomly selected from the 48 grids. Eight of these positions were designated as targets for repeated layouts, and the remaining eight were allocated to novel layouts. In each layout condition, two target positions were assigned to each quadrant to prevent a spatially uneven distribution, which could cause attentional bias. Layout generation began by assigning the grid for the target, followed by randomly selecting 12 of the remaining 47 grids for the distractors. To maintain a uniform spatial distribution of items within scenes, three distractors were placed in each quadrant. For the repeated layouts, each target position was associated with a set of fixed distractor positions that remained constant throughout the experiment, whereas for the novel layouts, distractor positions were randomly generated before each trial. 
To introduce spatial variability to the layout, each item deviated from the center of its grid by a random distance and direction on each trial. In the high variability condition, stimuli appeared randomly within a 1.8° × 1.8° square area centered on the selected grid; in the low variability condition, this range was 1.0° × 1.0° (Figure 2). This spatial variability setup was inspired by the study of Higuchi et al. (2020); however, they selected item locations using a Gaussian distribution, whereas we used a uniform distribution. The Gaussian distribution places higher probability at the center and lower probability at the periphery, so items are more likely to appear near the center of the grid. This leads to greater similarity among the specific scenes belonging to the same repeated layout presented multiple times, which is equivalent to decreasing the overall proportion of specific repeated scenes. Because a decrease in the proportion of repeated scenes is itself a factor that interferes with contextual cue learning (Zinchenko, Conci, Müller, & Geyer, 2018), results obtained under those circumstances cannot unambiguously attribute reduced contextual cueing effects to variability size. The uniform distribution does not have this issue. 
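The display-generation procedure above can be sketched directly from the reported parameters (8 × 6 grid of 2.0° cells, one target plus three distractors per quadrant, uniform jitter windows of 1.8° or 1.0°). The function names and the returned data structure are our own illustration, not the authors' MATLAB code.

```python
import random

COLS, ROWS, GRID = 8, 6, 2.0          # 8 x 6 grid of 2.0-deg cells
JITTER = {"high": 1.8, "low": 1.0}    # side of the square jitter window (deg)

def quadrant(cell):
    """Quadrant index (0-3) of a grid cell given as (column, row)."""
    c, r = cell
    return (c >= COLS // 2) + 2 * (r >= ROWS // 2)

def cell_center(cell):
    """Center of a grid cell in degrees of visual angle."""
    c, r = cell
    return ((c + 0.5) * GRID, (r + 0.5) * GRID)

def generate_scene(target_cell, variability, rng=random):
    """One search scene: 12 distractors (3 per quadrant) plus the target,
    each offset uniformly within its variability window, re-drawn per trial."""
    half = JITTER[variability] / 2.0
    cells = []
    for q in range(4):
        pool = [(c, r) for c in range(COLS) for r in range(ROWS)
                if quadrant((c, r)) == q and (c, r) != target_cell]
        cells += rng.sample(pool, 3)
    def place(cell):
        x, y = cell_center(cell)
        return (x + rng.uniform(-half, half), y + rng.uniform(-half, half))
    return {"target": place(target_cell),
            "distractors": [place(c) for c in cells]}
```

For a repeated layout, the same cells would be reused across trials while `place` re-draws the offsets; for a novel layout, the cells themselves would also be re-sampled each trial.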
The formal experiment comprised 20 blocks of contextual cueing search trials. Each block consisted of 16 trials, with 8 trials involving repeated layouts and 8 trials involving novel layouts. Every participant in the different variability groups had to perform 320 visual search trials during the formal experiment. 
Trial procedure
A trial begins with a black fixation cross (RGB = 0, 0, 0; 1.0° × 1.0°) presented at the screen center for 1,000 ms. Following the fixation point, the search scene appears. Participants must determine the orientation of the target "T" as quickly and accurately as possible, pressing the "J" key if the target is rotated clockwise or the "F" key if it is rotated counterclockwise. Once a response is made, the search scene disappears and the next trial begins. If a participant fails to respond within 5 seconds, the next trial starts automatically, and the unanswered trial is counted as an erroneous response. 
Experimental procedure
The experiment comprises three sections. The first is a practice task of 24 trials, during which the experimenter explains the experimental instructions and familiarizes participants with the task. Next is the formal experimental section, in which participants complete 20 blocks of contextual cueing search trials. Throughout the formal experiment, participants can take a break after every five blocks and resume by pressing the space key. 
The third section is a recognition task. This task comprises 16 trials: 8 trials present repeated layouts encountered during the formal experiment, and 8 trials display randomly generated novel layouts. The layouts are presented in random order, and participants indicate their familiarity with each layout by pressing the "F" key if it feels familiar or the "J" key if it does not. Participants were not informed of this task in advance. 
Statistics
We calculated each participant's accuracy rate and removed participants with accuracy below 80%. For reaction times, we excluded the following trials: incorrect trials, trials with no response within 5 seconds, trials with reaction times shorter than 200 ms, and extreme outliers under the 3σ criterion. The mean reaction time for each participant under each experimental condition was then calculated from the filtered data. Owing to the limited number of trials per block (8 repeated and 8 novel layout trials), substantial chance error can occur. To address this issue, we adopted the common approach from previous studies of collapsing five consecutive blocks into one "epoch" (Chun & Jiang, 1998; Vaskevich & Luria, 2019). 
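The exclusion and epoch-collapsing steps can be expressed compactly; the trial dictionary format below is an assumption for illustration, not the authors' analysis code.

```python
from statistics import mean, stdev

def clean_rts(trials):
    """Apply the exclusion criteria: correct responses only, RT between
    200 ms and 5,000 ms, then drop 3-sigma outliers from what remains."""
    rts = [t["rt"] for t in trials
           if t["correct"] and 200 <= t["rt"] < 5000]
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= 3 * s]

def to_epochs(block_means, blocks_per_epoch=5):
    """Collapse consecutive block means into epochs (5 blocks each),
    so 20 blocks yield the 4 epochs entered into the ANOVA."""
    return [mean(block_means[i:i + blocks_per_epoch])
            for i in range(0, len(block_means), blocks_per_epoch)]
```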
Finally, the statistical analysis was a 2 (spatial variability: high vs. low) × 2 (layout type: repeated vs. novel) × 4 (time course: epochs 1–4) mixed design. Spatial variability was the between-subjects factor, and layout type and time course were within-subjects factors. The dependent variable was the mean reaction time in each condition. 
Regarding the results of the recognition task, average recognition rates were calculated separately for the high and low variability groups. These rates were then compared against the 50% chance level using one-sample t-tests to determine whether participants could recognize the repeated layouts. 
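The chance-level comparison is a standard one-sample t-test; a minimal pure-Python version is shown below for reference (the study presumably used standard statistics software).

```python
import math
from statistics import mean, stdev

def one_sample_t(scores, mu=50.0):
    """t statistic comparing recognition rates (%) against the 50%
    chance level; the corresponding df is len(scores) - 1."""
    n = len(scores)
    return (mean(scores) - mu) / (stdev(scores) / math.sqrt(n))
```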
Results
In the main experiment, the average accuracy was 98.69% ± 0.83%. Descriptive statistics of reaction times are presented in Table 1 and illustrated in Figure 3. A mixed-design repeated measures analysis of variance (ANOVA) examined the effects of the three factors (spatial variability, layout type, and time course) on reaction times. Results indicated a significant main effect of the within-subject factor layout type, F(1, 44) = 17.40, p < 0.001, η² = 0.28. Post hoc tests revealed faster reaction times for repeated layouts, M = 824 ms, SE = 21 ms, than for novel layouts, M = 863 ms, SE = 22 ms, p < 0.001, Bonferroni corrected, indicating an overall contextual cueing effect in the experiment. The main effect of time course was also significant, F(2.34, 102.96) = 32.77, p < 0.001, η² = 0.43, Greenhouse-Geisser corrected. Post hoc tests revealed significantly slower reaction times in epoch 1 than in epoch 2, p < 0.001, Bonferroni corrected, but no significant differences between epoch 2 and the subsequent epochs, p > 0.05, Bonferroni corrected, demonstrating the typical practice effect. The between-subject factor spatial variability did not show a significant main effect, F(1, 44) = 3.13, p = 0.084, indicating no overall reaction time difference between the two groups and confirming that the groups were comparable. 
Table 1.
 
Comparison of reaction times between novel and repeated layouts in Experiment 1. Notes: Significantly faster reaction times for the repeated layouts compared with the novel layouts imply the presence of a contextual cueing effect. The statistical method is the paired-sample t-test.
Figure 3.
 
Results of Experiment 1. The horizontal axis represents the time course. Blue diamonds represent reaction times (RT) for novel displays, and orange diamonds represent RT for repeated displays. Error bars indicate the standard error of the mean.
Regarding interactions, the three-way interaction of spatial variability × layout type × time course was not significant, F(3, 132) = 0.43, p = 0.735. Similarly, the interactions of spatial variability × time course, F(3, 132) = 2.264, p = 0.194, and layout type × time course, F(3, 132) = 1.39, p = 0.250, were not significant. However, the interaction of spatial variability × layout type was significant, F(1, 44) = 4.61, p = 0.037, η² = 0.10. Simple effects analysis revealed that, under the low variability condition, reaction times for novel layouts, M = 910 ms, SE = 32 ms, were longer than for repeated layouts, M = 852 ms, SE = 29 ms, p < 0.001, Bonferroni corrected, exhibiting a typical contextual cueing effect. Conversely, under the high variability condition, there was no significant difference between reaction times for novel layouts, M = 816 ms, SE = 32 ms, and repeated layouts, M = 797 ms, SE = 29 ms, p = 0.160, Bonferroni corrected (Figure 3). These results indicate that contextual cue learning was notably impaired under high variability conditions. 
To ensure comparability between repeated and novel layouts, we conducted paired-sample t-tests comparing participants' reaction times in the first two blocks for both novel and repeated layouts across both spatial variability groups. Results revealed no significant differences between the two layout types for either spatial variability group, p > 0.1, confirming the comparability between the two layout types. 
Because no interaction involving the time course was found, and this pattern was not caused by inherent differences between the repeated and novel layouts, we conclude that greater spatial variability impedes contextual cue learning. 
Regarding the recognition task, one-sample t-tests indicated no statistically significant difference from the 50% chance level in recognition performance for either the low variability group, M = 49.73%, SD = 10.40%, t(22) = −0.13, p = 0.901, or the high variability group, M = 48.64%, SD = 13.71%, t(22) = −0.48, p = 0.639. This suggests that participants were unable to recognize the previously learned repeated layouts, indicating that contextual cue learning was implicit. 
Discussion
The results of Experiment 1 indicate that high spatial variability affects contextual cue learning. In the high spatial variability group, participants did not exhibit the contextual cueing effect even after four epochs, whereas in the low variability group, participants exhibited the effect in the first epoch. Concerning the reason for this effect, we originally hypothesized that participants must acquire clustering knowledge before contextual cue learning, so greater variability might delay the onset of contextual cue learning, because learning clustering knowledge requires more samples than under the lower variability condition. However, our current findings did not firmly support this hypothesis, because participants did not demonstrate the contextual cueing effect even in the final epoch under the high variability condition. This outcome admits two explanations. First, as we proposed, the cognitive system had not fully acquired clustering knowledge by the end of the task and therefore could not initiate contextual cue learning. Alternatively, our assumption might be invalid: the cognitive system is incapable of learning clustering knowledge, leading to difficulty recognizing repeated layouts under high variability. To address this issue, we increased the number of blocks dedicated to contextual cue learning in Experiment 2. 
Experiment 2: Learning of clustering knowledge independent of specific repeated scenes
The aim of Experiment 2 was to validate that the acquisition of clustering knowledge is independent of specific repeated scenes. In addition, because Experiment 1 only confirmed that large variability hinders contextual cue learning and cannot directly serve as evidence that large variability merely delays contextual cueing, Experiment 2 needed to include more epochs, while keeping the stimulus parameters, the number of repeated layouts to be learned, and the ratio of repeated to novel layouts identical to Experiment 1, to ensure detection of contextual cue learning. 
Our initial assumption suggests that participants can acquire clustering knowledge through searches in spatially variable novel layouts. Consequently, individuals who first perform searches featuring spatially variable novel layouts should demonstrate the contextual cueing effect more rapidly when they later encounter repeated layouts. To test this hypothesis, we designed two consecutive, unannounced visual search stages in Experiment 2. In the first stage, termed the "prelearning stage," participants performed extensive searches involving novel layouts with spatial variability so as to acquire clustering knowledge. In the second stage, the "contextual cue learning stage," one-half of the trials were transformed into repeated layouts, and participants engaged in contextual cue learning with spatial variability, allowing us to determine whether the knowledge acquired in the prelearning stage facilitates contextual cue learning. 
In the prelearning stage of Experiment 2, two levels of spatial variability were used: a high and a low variability condition. Under the high variability condition, participants performed novel layout searches with the same spatial variability as the subsequent contextual cue learning stage. In contrast, in the low variability condition, which served as a control, participants performed novel layout searches with smaller variability than in the contextual cue learning stage. A greater-variability control was not chosen because the range of small variability is contained within that of greater variability. This control served to prevent premature learning of the clustering knowledge required in the contextual cue learning stage and to balance the potential interference of entirely novel scene searches with subsequent contextual cue learning (Jungé, Scholl, & Chun, 2007; Vaskevich & Luria, 2019). No cues were provided between the first and second stages, to avoid prompting participants to use clustering knowledge proactively or to initiate contextual cue learning. Additionally, to ensure the detection of contextual cue learning, we doubled the number of blocks in the contextual cue learning stage compared with Experiment 1.
Methods
Participants
Experiment 2 recruited 55 healthy participants under the same criteria as Experiment 1. Participants were randomly assigned to two groups, completing the prelearning stage with either low or high variability. All participants provided written informed consent. The low variability group comprised 27 participants (3 males and 24 females) with an average age of 20.0 ± 1.69 years. The high variability group recruited 28 participants; however, 1 participant with an accuracy rate below 70% was excluded, leaving 27 effective participants (5 males and 22 females) with an average age of 21.3 ± 1.77 years. 
The expected statistical power was calculated using G*Power 3.1 software with parameters identical to Experiment 1, yielding a statistical power of 0.99. 
Apparatus and Procedure
Experiment 2 used the same equipment, apparatus, stimuli, response methods, and scene settings as Experiment 1.
The formal experimental section of Experiment 2 comprised two stages. In the first stage, known as the prelearning stage, participants conducted 320 trials of novel layout visual searches across 20 blocks, encompassing 16 trials per block. During this stage, participants performed visual searches under either high or low spatial variability conditions. In the second stage, termed the contextual cue learning stage, both groups engaged in contextual cue learning under high spatial variability. Each block in the contextual cue learning stage included 16 trials, with 8 trials presenting repeated layouts and 8 trials presenting novel layouts. Participants completed 40 blocks, totaling 640 trials of contextual cue learning. There were no cues or indications provided between these two stages. 
In Experiment 2, participants initially completed a practice task comprising 24 trials to familiarize themselves with the task instructions. Following this, they proceeded to the formal experiment and the recognition task. Data from both the formal experiment and the recognition task were recorded for subsequent statistical analysis. 
Statistics
Similar to Experiment 1, we excluded reaction times from error trials and extreme values in Experiment 2. Additionally, to tackle the issue of a limited number of trials within individual blocks, we implemented a collapsing strategy where five consecutive blocks formed an epoch. This epoch served as the minimum unit for the time course. Statistical analyses were performed separately for data derived from both the prelearning stage and the contextual cue learning stage. 
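As a purely illustrative sketch of the collapsing strategy described above (the function name and data layout are hypothetical; the analysis was not necessarily implemented in code this way), averaging every five consecutive block means yields one epoch mean:

```python
def collapse_into_epochs(block_means, blocks_per_epoch=5):
    """Average every run of `blocks_per_epoch` consecutive block means
    into a single epoch mean (hypothetical helper)."""
    n_epochs = len(block_means) // blocks_per_epoch
    return [
        sum(block_means[i * blocks_per_epoch:(i + 1) * blocks_per_epoch])
        / blocks_per_epoch
        for i in range(n_epochs)
    ]

# The 20 prelearning blocks collapse into 4 epochs (epochs 1-4),
# and the 40 contextual cue learning blocks into 8 epochs (epochs 5-12).
prelearning_epochs = collapse_into_epochs(list(range(20)))  # placeholder RTs
```

With placeholder block means 0–19, the four epoch means come out as 2.0, 7.0, 12.0, and 17.0, matching the five-block grouping.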
A 2 (spatial variability: high variability vs. low variability) × 4 (time course: epochs 1–4) 2-factor mixed-design repeated-measures ANOVA was conducted on reaction time data from the prelearning stage, with spatial variability serving as the between-subject factor. For the reaction time data from the contextual cue learning stage, we used a mixed-design repeated-measures three-factor ANOVA, incorporating 2 (variability in the prelearning stage: high variability vs. low variability) × 2 (layout type: repeated layout vs. novel layout) × 8 (time course: epochs 5–12), with variability in the prelearning stage as the between-subject factor. 
For the recognition task, the average accuracy of the high and low variability groups was computed separately. One-sample t-tests then compared these accuracies against the chance level of 50% to determine whether participants could recognize the presented repeated layouts. 
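The comparison against chance reduces to a one-sample t statistic on per-participant accuracies; a minimal pure-Python sketch follows (the accuracy values are placeholders for illustration, not the reported data):

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """t statistic for a one-sample t-test of `sample` against `mu`,
    with df = len(sample) - 1."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / math.sqrt(n))

# Placeholder recognition accuracies for four hypothetical participants,
# tested against the 50% chance level.
accuracies = [0.60, 0.55, 0.65, 0.60]
t = one_sample_t(accuracies, 0.50)
```

A significant positive t indicates above-chance recognition of the repeated layouts.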
Results
The average accuracy in the visual search task was 98.31% ± 1.62%. 
Reaction times of the prelearning stage
Descriptive statistics of reaction times in the prelearning stage are presented in Table 2 and shown against the light yellow background in Figure 4. A 2 (spatial variability: high variability vs. low variability) × 4 (time course: epochs 1–4) two-way mixed-design ANOVA revealed that the between-subject factor, spatial variability, did not yield a significant main effect, F(1, 52) = 2.38, p = 0.129, suggesting that variability itself does not influence search efficiency. The main effect of time course was significant, F(2.34, 121.69) = 44.91, p < 0.001, η² = 0.46, Greenhouse-Geisser corrected. Post hoc comparisons revealed significant decreases in reaction times from epoch 1 to epoch 3, p < 0.05, Bonferroni corrected, but no significant difference between epoch 3 and epoch 4, p = 0.183, Bonferroni corrected, indicating a practice effect that plateaued between epochs 3 and 4. Furthermore, the interaction was not significant, F(2.34, 121.69) = 0.16, p = 0.920, suggesting no interaction between time course and spatial variability during this stage. 
Table 2.
 
Comparison of reaction times (RT) between the two variability conditions in the prelearning stage of Experiment 2. Notes: This stage comprises visual search scenes with novel layouts only. The statistical method is an independent-samples t-test.
Figure 4.
 
Results of Experiment 2. The horizontal axis represents the time course. The light yellow background represents the prelearning stage, and the white background represents the contextual cue learning stage. The top represents the condition of presenting scenes with high variability during the prelearning stage, and the bottom represents the condition with low variability during the prelearning stage. Blue diamonds represent RT for novel displays, and orange diamonds represent RT for repeated displays. Error bars indicate the standard error of the mean.
Reaction times of the contextual cue learning stage
The descriptive statistics for the contextual cue learning stage are presented in Table 3 and shown against the white background in Figure 4. 
Table 3.
 
Comparison of RT between novel and repeated layouts in the contextual cue learning stage of Experiment 2. Notes: Significantly faster reaction times for the repeated layouts compared with the novel layouts imply the presence of the contextual cueing effect. The statistical method is a paired-sample t-test.
A 2 (variability in the prelearning stage: high variability vs. low variability) × 2 (layout type: repeated vs. novel layout) × 8 (time course: epochs 5–12) repeated-measures ANOVA was conducted on the reaction times of this stage. The results indicated a significant main effect of layout type, F(1, 52) = 12.39, p = 0.001, η² = 0.19. Post hoc tests revealed that reaction times for the repeated layout, M = 846, SE = 17, were significantly faster than those for the novel layout, M = 875, SE = 16, p = 0.001, Bonferroni corrected, indicating an overall contextual cue effect in this stage. Moreover, there was a significant main effect of time course, F(4.45, 231.5) = 26.99, p < 0.001, η² = 0.34, Greenhouse-Geisser corrected; post hoc tests showed a significant decrease in reaction times from epoch 5 to epoch 8, p < 0.05, Bonferroni corrected, but no significant differences between epoch 9 and later epochs, p > 0.05, Bonferroni corrected, indicating a practice-like trend. However, the main effect of variability in the prelearning stage was not significant, F(1, 52) = 0.45, p = 0.507, suggesting no overall difference in average reaction times between the two groups. 
Regarding interaction effects, the three-way interaction of variability in the prelearning stage × layout type × time course was not significant, F(7, 364) = 1.24, p = 0.279, and the two-way interactions of layout type × time course, F(7, 364) = 1.53, p = 0.155, and time course × variability in the prelearning stage, F(4.45, 231.50) = 1.70, p = 0.109, Greenhouse-Geisser corrected, were also not significant. However, the interaction between variability in the prelearning stage and layout type was significant, F(1, 52) = 5.75, p = 0.020, η² = 0.10. Further analysis showed no significant difference in reaction times between the novel layout, M = 875, SE = 22, and the repeated layout, M = 866, SE = 24, p = 0.431, Bonferroni corrected, in the low variability condition. In the high variability condition, by contrast, reaction times for the novel layout, M = 874, SE = 22, were significantly longer than those for the repeated layout, M = 826, SE = 24, p < 0.001, Bonferroni corrected, revealing a pronounced contextual cue effect (Figure 5). This result indicates that extensive searches involving novel layouts with corresponding spatial variability during the prelearning stage facilitated subsequent contextual cue learning, suggesting that participants learned clustering knowledge during the prelearning stage and applied it to subsequent scene searches. 
Figure 5.
 
Post hoc results of Experiment 2. Post hoc tests revealed no significant difference in RT between the novel layout and repeated layout in the low variability condition (p = 0.431, Bonferroni corrected). However, in the high variability condition, reaction times for the novel layout were significantly longer than those for the repeated layout (p < 0.001, Bonferroni corrected), revealing a pronounced contextual cue effect.
Because no interaction involving the time course was found, we conducted a paired-sample t-test comparing reaction times between repeated and novel layouts in epoch 12 for the low variability group to detect the presence of contextual cue learning. The results revealed that participants exhibited a contextual cueing effect, t(26) = 2.569, p = 0.016, Cohen's d = 1.008 (Table 3), indicating that the low variability group eventually exhibited contextual cueing effects. 
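The epoch-12 comparison is a paired-sample t-test on per-participant RT differences; a minimal sketch, assuming one mean RT per participant per layout type (all values below are placeholders, not the reported data):

```python
import math
from statistics import mean, stdev

def paired_t(novel_rts, repeated_rts):
    """Paired-sample t statistic on per-participant RT differences
    (novel minus repeated; a positive t indicates a contextual cueing
    effect, i.e., faster responses to repeated layouts)."""
    diffs = [nov - rep for nov, rep in zip(novel_rts, repeated_rts)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Placeholder epoch-12 mean RTs (ms) for four hypothetical participants
novel = [900.0, 880.0, 920.0, 910.0]
repeated = [850.0, 860.0, 880.0, 870.0]
t = paired_t(novel, repeated)
```

Note that pairing by participant removes between-subject RT variability from the denominator, which is why a paired test is appropriate here.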
Recognition task
Separate one-sample t-tests compared the recognition performance of both groups against the chance level of 50%. The results revealed no significant difference between the low variability group's recognition accuracy and the chance level, M = 50.93%, SE = 0.03, t(26) = −0.348, p = 0.731, indicating that participants in the low variability group were unable to recognize the repeated layouts. However, recognition accuracy in the high variability group was significantly higher than the chance level, M = 59.72%, SE = 0.02, t(26) = 5.381, p < 0.001, Cohen's d = 2.11. An independent-samples t-test further showed a significant difference in recognition accuracy between the two groups, t(52) = 2.734, p = 0.009, Cohen's d = 0.76, suggesting that participants who had previously learned the corresponding clustering knowledge possessed explicit knowledge of the repeated layouts in their memory system. 
Discussion
In Experiment 2, both the high and low variability groups eventually exhibited contextual cueing effects. This experiment therefore confirms that the interference of high spatial variability with contextual cue learning observed in Experiment 1 reflected a delay rather than an inability to initiate contextual cue learning. Additionally, we found that participants whose novel layout searches matched the spatial variability of the contextual cue learning stage acquired the contextual cueing effect faster. These findings align with our expectations, indicating that the cognitive system indeed learns clustering knowledge independently of specific repeated scenes. Moreover, the results suggest that contextual cue representations can be established using abstracted scene layouts derived from pre-existing clustering knowledge. 
General discussion
The aim of this study was to validate that the acquisition of clustering knowledge of scenes occurs independently before contextual cue learning, thus avoiding the assumption that participants specifically learn concrete scenes when the learning materials contain spatial variability. Our explanation encompasses two interdependent hypotheses: first, the acquisition of clustering knowledge temporally precedes contextual cue learning, and, second, the acquisition of clustering knowledge is automatic and independent of specific repeated scenes. Experiment 1 revealed that substantial spatial variability within scenes hindered the manifestation of contextual cue effects, indicating that spatial variability impedes contextual cue learning. Experiment 2 further demonstrated that conducting extensive novel layout searches with high variability facilitates subsequent contextual cue learning under high variability; in contrast, conducting such searches with low variability beforehand does not have the same effect. This evidence confirms the independent and automatic acquisition of clustering knowledge and further validates that contextual cue learning can be built on the clustering results. Furthermore, results from the recognition task suggest that acquiring the corresponding clustering knowledge before contextual cue learning moderately enhances the explicit recognition of repeated layouts. 
Experiment 1 indicates that jitter of items within scenes over a large spatial range impedes contextual cue learning. It is worth noting that the between-subject factor of spatial variability showed a marginally significant difference, p = 0.084. Recent research has indicated that, in visual search tasks, overall task response times are shorter when all scenes are either repeated or novel than when one-half are repeated and one-half novel (Vaskevich & Luria, 2018). The authors attribute this to the cost of uncertainty in the task. In our experiment, the low variability group learned the repeated scenes and was thus exposed to this uncertainty at the scene level, whereas the high variability group never learned the scenes and therefore could not encounter it. The marginally significant difference we found thus seems attributable to differences in uncertainty. 
Results from Experiment 2 indicate that participants who had previously performed extensive searches of novel layouts with high variability exhibited contextual cue effects as early as the first epoch in which repeated layouts appeared. Extensive searches in novel scenes have been shown to impair subsequent contextual cue learning; the researchers suggested that observers predict environmental regularities, and a substantial number of randomly generated scene searches prompts the cognitive system to develop an implicit belief in layout irregularity, subsequently impeding learning (Jungé et al., 2007; Vaskevich & Luria, 2019). Clearly, conducting more novel layout searches implies a stronger implicit belief in scene irregularity. If clustering knowledge and scene layout are learned sequentially, participants with high-variability prelearning should recognize the lack of layout regularity later than those with low-variability prelearning. This would result in a weaker implicit belief in scene irregularity, requiring fewer trials to dispel. Conversely, if clustering knowledge is not learned independently and the cognitive system learns distorted scenes, then either level of prelearning variability would produce an equally strong implicit belief in layout irregularity, causing similar interference in the subsequent task for both groups. Furthermore, the introduction of high variability would make it difficult for the cognitive system to recognize repeated layouts, making contextual cue learning under high-variability prelearning conditions even more challenging. 
The current evidence supports the first scenario, where clustering knowledge and scene layout are learned sequentially. Specifically, if we define the time required for participants to demonstrate contextual cue effects as the duration encompassing both clustering knowledge learning and contextual cue learning, the findings from Experiment 2 under the high-variability condition indicate that the cognitive system completed both learning processes within five epochs. However, participants who previously engaged in extensive visual searches with low variability took significantly longer to exhibit the contextual cue effect, indicating a stronger implicit belief in layout irregularity compared with the high-variability group. 
Another noteworthy issue is that greater variability in item positions decreases overall orderliness, which may interfere with search processing. This issue needs to be examined at both the scene and task levels. In the first search stage of Experiment 2, we established high and low spatial variability conditions in novel scenes and found no significant main effect of spatial variability, F(1, 52) = 2.38, p = 0.129. This result suggests that, at the scene level, spatial variability does not significantly impact visual search processes; thus, in our experiment, different levels of spatial variability had negligible effects on visual search. However, at the task level, we introduced repeated layouts, and the orderliness produced by repeated layouts may interact with variability. The current results may partially support the notion that a combination of both types of orderliness can interfere with learning, as seen in the learning difficulties of the high variability group in Experiment 1 and the non-corresponding group in the contextual cue learning stage of Experiment 2. Nonetheless, regardless of whether this task-level orderliness has a negative impact, Experiment 2 found that subsequent contextual cue learning was facilitated after participants performed extensive visual searches with corresponding variability. This suggests that even in the presence of interference caused by low orderliness, participants can overcome it by first acquiring clustering knowledge. 
This study found that participants who acquire the corresponding clustering knowledge beforehand can explicitly recognize repeated layouts to some extent. Experiment 1, like previous studies, found that participants could not explicitly recognize repeated layouts, indicating that contextual cue learning is implicit (Chun & Jiang, 1998; Chun & Jiang, 1999, 2003; Jiang & Chun, 2001). Some accounts suggest that the representations of implicit and explicit memory share a common storage mechanism, differing only in the conscious state during retrieval, with processing depth affecting conscious states (Kroell, Schlagbauer, Zinchenko, Müller, & Geyer, 2019; Turk-Browne, Yi, & Chun, 2006). However, other studies suggest that implicit memory runs parallel to explicit memory and is unaffected by processing depth (Tulving & Schacter, 1990). In our study, participants in the low variability group of Experiment 1 completed only 4 epochs of contextual cue learning, whereas the corresponding group in Experiment 2 completed 8 epochs, indicating a higher degree of overlearning in Experiment 2. Additionally, compared with scenes with low variability, processing scenes with high variability places a greater demand on in-depth analysis. In conclusion, our results support the hypothesis that implicit and explicit memories share the same storage mechanism, with the conscious state influenced by processing depth. 
Research on reinforcement learning proposes that any complex task can be decomposed into a series of independent subtasks (Botvinick, Niv, & Barto, 2009). Consistent with the core idea of these studies, in our research the process of generating contextual cue effects when items in the search scene jitter within a certain range can also be decomposed into two independent subprocesses: scene layout abstraction and contextual cue processing. Contextual cueing is a visual statistical learning phenomenon (Goujon, Didierjean, & Thorpe, 2015). Sequence learning is another such phenomenon, and it has been found that its response-facilitating subprocesses are learned independently and can make separate predictions (Goschke & Bolte, 2012). Our study further demonstrates that the output of one learning process in statistical learning can serve as the input for another. 
Prior research often interprets the generalization of learning as an extension of specifically acquired information, implying that the cognitive system learns a prototype and that generalization relies on similarity to this prototype (Medin, Goldstone, & Gentner, 1993; Shepard, 1958a; Shepard, 1958b). In line with this, some researchers have developed mathematical models of psychological similarity using the Mahalanobis distance (Cronbach & Gleser, 1953; Moon & Phillips, 2001; Rosenholtz, 1999). These models assume that the cognitive system can calculate means and variances from a large number of examples and use them to analyze the similarity between new and learned examples. They acknowledge clustering learning but treat similarity judgment as the ultimate goal of clustering analysis, remaining reliant on prototypes. In contrast, our study revealed that, in visual statistical learning, the spatial variability that determines the extent of generalization operates not through prototypes but through an independent clustering analysis process. Unlike previous prototype-based interpretations, our explanation emphasizes that learned prototypes can themselves stem from clustering learning; generalization determined by spatial variability is a consequence of this foundational clustering learning. 
In summary, we propose that, during the statistical analysis of visual scenes with variability, the cognitive system engages sequentially in two distinct levels of statistical learning. At the first level, termed the item level, participants use numerous specific scenes as examples to calculate a spatial position centroid (mean) for each potential stimulus item in the scene, along with the degree of deviation from this centroid when items appear (variance). At the second level, termed the scene layout level, each item in a newly encountered scene is classified into a cluster, thereby abstracting the specific scene into a scene layout, and the cognitive system uses this scene layout as input to generate long-term contextual cue representations through timely reinforcement. 
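The two-level scheme can be made concrete with a minimal computational sketch (all names, positions, and the nearest-centroid rule are hypothetical illustrations, not a claim about the cognitive system's actual algorithm): the item level estimates a centroid and spread per cluster, and the layout level assigns each item position to its nearest centroid, so two differently jittered instances of the same layout abstract to the same cluster sequence.

```python
import math

def fit_clusters(samples):
    """Item level: estimate a spatial centroid (mean) and spread (variance)
    for each cluster from observed (x, y) item positions.
    `samples` maps a cluster id to a list of positions."""
    clusters = {}
    for cid, pts in samples.items():
        n = len(pts)
        cx = sum(x for x, _ in pts) / n
        cy = sum(y for _, y in pts) / n
        var = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts) / n
        clusters[cid] = ((cx, cy), var)
    return clusters

def abstract_scene(scene, clusters):
    """Layout level: assign each item position to its nearest centroid,
    abstracting a specific scene into a layout (a tuple of cluster ids)."""
    return tuple(
        min(clusters, key=lambda cid: math.dist(pos, clusters[cid][0]))
        for pos in scene
    )

# Two jittered instances of the same underlying layout...
samples = {"A": [(0, 0), (1, 1), (0, 1), (1, 0)],
           "B": [(10, 10), (11, 11), (10, 11), (11, 10)]}
clusters = fit_clusters(samples)
scene1 = [(0.2, 0.4), (10.8, 10.1)]
scene2 = [(0.9, 0.6), (10.2, 10.9)]
# ...abstract to the same cluster sequence despite different item positions,
# which is what would allow contextual cue learning to treat them as one layout.
```

In this toy example, both scenes abstract to the sequence ("A", "B"), illustrating how item-level clustering knowledge could let the layout level overcome scene dissimilarity.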
As mentioned elsewhere in this article, when repeated scene layouts are combined with high variability, the probability of encountering two highly similar specific scenes within a short time is low, making it challenging to learn specific repeated scenes. The magnitude of variability during the prelearning stage led to varying degrees of difficulty in contextual cue learning, indicating that variability influences when the cognitive system identifies irregularities in scene layouts during the prelearning stage. These findings confirm that variability and scene layout are learned sequentially. However, because Experiment 2 deliberately introduced a scenario in which variability was learned before repeated layouts, further research is needed to investigate whether clustering learning and repeated scene layout learning interact when variability is introduced into repeated layouts during participants' initial encounter with the search task. 
Conclusions
This study establishes the presence of an independent process of learning clustering knowledge from visual scenes during visual search. This acquired knowledge is instrumental in abstracting scenes into spatial layouts, which are subsequently used for further contextual cue learning. Our findings affirm that within visual statistical learning, the outcomes of item-level statistical learning can act as inputs for layout-level statistical learning processes. Moreover, contrary to prior prototype-based interpretations of generalization, our research emphasizes that generalization may result from fundamental statistical clustering learning processes. 
Acknowledgments
Supported by grants from the National Natural Science Foundation of China (NSFC31970989 to Qiang Liu). 
Ethics approval: Ethics Committee of Liaoning Normal University approved the study (number: LL2023038), and signed informed consent was obtained from all participants. 
Informed consent: All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2000. Informed consent was obtained from all participants for being included in the study. 
Data are available at https://osf.io/4x5wa/. 
Commercial relationships: none. 
Corresponding author: Qiang Liu. 
Address: Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610000, China. 
References
Botvinick, M. M., Niv, Y., & Barto, A. G. (2009). Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition, 113(3), 262–280, https://doi.org/10.1016/j.cognition.2008.08.011.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436, https://doi.org/10.1163/156856897x00357.
Chun, M. M., & Jiang, Y. (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36(1), 28–71, https://doi.org/10.1006/cogp.1998.0681.
Chun, M. M., & Jiang, Y. (1999). Top-down attentional guidance based on implicit learning of visual covariation. Psychological Science, 10(4), 360–365, https://doi.org/10.1111/1467-9280.00168.
Chun, M. M., & Jiang, Y. (2003). Implicit, long-term spatial contextual memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(2), 224–234, https://doi.org/10.1037/0278-7393.29.2.224.
Cronbach, L. J., & Gleser, G. C. (1953). Assessing similarity between profiles. Psychological Bulletin, 50(6), 456–473, https://doi.org/10.1037/h0057173.
Goschke, T., & Bolte, A. (2012). On the modularity of implicit sequence learning: Independent acquisition of spatial, symbolic, and manual sequences. Cognitive Psychology, 65(2), 284–320, https://doi.org/10.1016/j.cogpsych.2012.04.002.
Goujon, A., Didierjean, A., & Thorpe, S. (2015). Investigating implicit statistical learning mechanisms through contextual cueing. Trends in Cognitive Sciences, 19(9), 524–533, https://doi.org/10.1016/j.tics.2015.07.009.
Higuchi, Y., Ueda, Y., Shibata, K., & Saiki, J. (2020). Spatial variability induces generalization in contextual cueing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(12), 2295–2313, https://doi.org/10.1037/xlm0000796.
Jiang, Y., & Chun, M. M. (2001). Selective attention modulates implicit learning. Quarterly Journal of Experimental Psychology Section A, 54(4), 1105–1124, https://doi.org/10.1080/713756001.
Jiang, Y., & Leung, A. W. (2005). Implicit learning of ignored visual context. Psychonomic Bulletin & Review, 12(1), 100–106, https://doi.org/10.3758/BF03196353.
Jiang, Y. V., & Sisk, C. A. (2020). Contextual cueing. In Pollmann, S. (Ed.), Spatial learning and attention guidance (pp. 59–72). Springer US, https://doi.org/10.1007/7657_2019_19.
Jungé, J. A., Scholl, B. J., & Chun, M. M. (2007). How is spatial context learning integrated over signal versus noise? A primacy effect in contextual cueing. Visual Cognition, 15(1), 1–11, https://doi.org/10.1080/13506280600859706.
Kroell, L. M., Schlagbauer, B., Zinchenko, A., Müller, H. J., & Geyer, T. (2019). Behavioural evidence for a single memory system in contextual cueing. Visual Cognition, 27(5–8), 551–562, https://doi.org/10.1080/13506285.2019.1648347.
Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological Review, 100(2), 254–278, https://doi.org/10.1037/0033-295X.100.2.254.
Merrill, E. C., Conners, F. A., Yang, Y., & Weathington, D. (2014). The acquisition of contextual cueing effects by persons with and without intellectual disability. Research in Developmental Disabilities, 35(10), 2341–2351, https://doi.org/10.1016/j.ridd.2014.05.026.
Milligan, G. W., & Cooper, M. C. (1987). Methodology review: Clustering methods. Applied Psychological Measurement, 11(4), 329–354, https://doi.org/10.1177/014662168701100401.
Moon, H., & Phillips, P. J. (2001). Computational and performance aspects of PCA-based face-recognition algorithms. Perception, 30(3), 303–321, https://doi.org/10.1068/p2896.
Ogawa, H., Takeda, Y., & Kumada, T. (2007). Probing attentional modulation of contextual cueing. Visual Cognition, 15(3), 276–289, https://doi.org/10.1080/13506280600756977.
Perruchet, P., & Vinter, A. (1998). PARSER: A model for word segmentation. Journal of Memory and Language, 39(2), 246–263, https://doi.org/10.1006/jmla.1998.2576.
Posner, M. I., & Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77(3, Pt.1), 353–363, https://doi.org/10.1037/h0025953.
Reber, P. J., Batterink, L. J., Thompson, K. R., & Reuveni, B. (2019). Implicit learning: History and applications. In Implicit learning (pp. 16–37). Routledge.
Rokach, L., & Maimon, O. (2005). Clustering methods. In Maimon, O., & Rokach, L. (Eds.), Data mining and knowledge discovery handbook (pp. 321–352). Springer US, https://doi.org/10.1007/0-387-25465-X_15.
Rosenholtz, R. (1999). A simple saliency model predicts a number of motion popout phenomena. Vision Research, 39(19), 3157–3163, https://doi.org/10.1016/S0042-6989(99)00077-2.
Schankin, A., & Schubö, A. (2009). Cognitive processes facilitated by contextual cueing: Evidence from event-related brain potentials. Psychophysiology, 46(3), 668–679, https://doi.org/10.1111/j.1469-8986.2009.00807.x.
Shepard, R. N. (1958a). Stimulus and response generalization: Deduction of the generalization gradient from a trace model. Psychological Review, 65(4), 242–256, https://doi.org/10.1037/h0043083.
Shepard, R. N. (1958b). Stimulus and response generalization: Tests of a model relating generalization to distance in psychological space. Journal of Experimental Psychology, 55(6), 509–523, https://doi.org/10.1037/h0042354.
Thomas, C., Didierjean, A., Maquestiaux, F., & Goujon, A. (2018). On the limits of statistical learning: Intertrial contextual cueing is confined to temporally close contingencies. Attention, Perception, & Psychophysics, 80(6), 1420–1435, https://doi.org/10.3758/s13414-018-1519-6.
Tulving, E., & Schacter, D. L. (1990). Priming and human memory systems. Science, 247(4940), 301–306, https://doi.org/10.1126/science.2296719.
Turk-Browne, N. B., Yi, D.-J., & Chun, M. M. (2006). Linking implicit and explicit memory: Common encoding factors and shared representations. Neuron, 49(6), 917–927, https://doi.org/10.1016/j.neuron.2006.01.030.
Vaskevich, A., & Luria, R. (2018). Adding statistical regularity results in a global slowdown in visual search. Cognition, 174, 19–27, https://doi.org/10.1016/j.cognition.2018.01.010.
Vaskevich, A., & Luria, R. (2019). Statistical learning in visual search is easier after experience with noise than overcoming previous learning. Visual Cognition, 27(5–8), 537–550, https://doi.org/10.1080/13506285.2019.1615022.
Zinchenko, A., Conci, M., Müller, H. J., & Geyer, T. (2018). Predictive visual search: Role of environmental regularities in the learning of context cues. Attention, Perception, & Psychophysics, 80(5), 1096–1109, https://doi.org/10.3758/s13414-018-1500-4. [PubMed]
Figure 1.
 
Schematic of clustering learning. At the beginning of the visual search task, the cognitive system samples search scenes and categorizes items into spatially isolated clusters (top). After acquiring clustering knowledge, each item in a new scene is classified into its respective cluster, so that specific scenes are abstracted into layouts (bottom). The potential range of item positions under different variability conditions is depicted in the right panel by light gray shading.
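The two stages in this schematic — grouping sampled item positions into spatially isolated clusters, then assigning items in a new scene to those clusters — can be sketched as follows. This is an illustrative stand-in only: the paper does not specify a particular algorithm, and the k-means-style procedure, function names, and parameters here are assumptions.

```python
import numpy as np

def learn_centroids(positions, n_clusters, n_iters=20, seed=0):
    """Group item positions (n x 2 array) into spatially isolated clusters.

    A simple k-means loop stands in for the automatic clustering process;
    the choice of algorithm is a hypothetical illustration.
    """
    rng = np.random.default_rng(seed)
    # initialize centroids at randomly chosen item positions
    idx = rng.choice(len(positions), n_clusters, replace=False)
    centroids = positions[idx].astype(float).copy()
    for _ in range(n_iters):
        # assign each item to its nearest centroid
        dists = np.linalg.norm(positions[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned items
        for k in range(n_clusters):
            members = positions[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids

def abstract_scene(scene, centroids):
    """Map each item in a new scene to its cluster index, yielding the
    abstracted layout of that scene."""
    dists = np.linalg.norm(scene[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

Once centroids are learned, two specific scenes whose items jitter within the same clusters map onto the same layout, which is the sense in which clustering overcomes scene dissimilarity.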
Figure 2.
 
Diagram of visual search scenes. The search space is uniformly divided into an 8 × 6 checkerboard grid. Each search scene comprises 12 distractors and 1 target randomly distributed across the grid. The target is a rotated "T," and the distractors are rotated "L's," all composed of the same line segments. The potential range of item positions under different variability conditions is depicted in the right panel by light gray shading. Under the high variability condition, items appear randomly within a larger range that extends across grid squares, whereas under the low variability condition, items appear within a smaller range inside a single grid square.
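The scene construction described here can be sketched in a few lines. The cell size in pixels and the exact jitter ranges are assumptions for illustration; only the 8 × 6 grid and the 12-distractor-plus-1-target composition come from the caption.

```python
import numpy as np

def make_scene(jitter_frac, cell=60.0, cols=8, rows=6, n_items=13, seed=None):
    """Generate one search scene: 1 target ("T") and 12 distractors ("L")
    placed in distinct cells of a cols x rows grid, each jittered around
    its cell centre.

    jitter_frac < 0.5 keeps items inside their own cell (low variability);
    larger values let items spill across cell borders (high variability).
    The 60-pixel cell size is a hypothetical choice.
    """
    rng = np.random.default_rng(seed)
    cells = rng.choice(cols * rows, size=n_items, replace=False)  # distinct cells
    cx = (cells % cols + 0.5) * cell   # cell-centre x coordinates
    cy = (cells // cols + 0.5) * cell  # cell-centre y coordinates
    jitter = rng.uniform(-jitter_frac * cell, jitter_frac * cell,
                         size=(n_items, 2))
    positions = np.column_stack([cx, cy]) + jitter
    roles = np.array(["T"] + ["L"] * (n_items - 1))  # 1 target, 12 distractors
    return positions, roles
```

Calling `make_scene(0.25)` versus `make_scene(0.75)` then yields low- and high-variability versions of the same underlying layout of occupied cells.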
Figure 3.
 
Results of Experiment 1. The horizontal axis represents the time course. Blue diamonds represent reaction times (RT) for novel displays, and orange diamonds represent RT for repeated displays. Error bars indicate the standard error of the mean.
Figure 4.
 
Results of Experiment 2. The horizontal axis represents the time course. The light yellow background marks the prelearning stage, and the white background marks the contextual cue learning stage. The top panel shows the condition in which high-variability scenes were presented during the prelearning stage, and the bottom panel shows the corresponding low-variability condition. Blue diamonds represent RT for novel displays, and orange diamonds represent RT for repeated displays. Error bars indicate the standard error of the mean.
Figure 5.
 
Post hoc results of Experiment 2. Post hoc comparisons revealed no significant difference in RT between the novel and repeated layouts in the low variability condition (p = 0.431, Bonferroni corrected). In the high variability condition, however, RT for the novel layout was significantly longer than that for the repeated layout (p < 0.001, Bonferroni corrected), revealing a pronounced contextual cueing effect.
Table 1.
 
Comparison of reaction times between novel and repeated layouts in Experiment 1. Notes: Significantly faster reaction times for repeated layouts than for novel layouts indicate the presence of a contextual cueing effect. The statistical method is a paired-samples t-test.
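The paired-samples comparison described in the note can be sketched as follows. The per-participant mean RTs below are made-up illustrative values, not the paper's data.

```python
import numpy as np

# Hypothetical per-participant mean RTs in ms (8 fictitious participants).
novel = np.array([820.0, 790.0, 865.0, 840.0, 810.0, 875.0, 830.0, 800.0])
repeated = np.array([780.0, 760.0, 830.0, 790.0, 770.0, 845.0, 805.0, 775.0])

# Paired-samples t statistic: mean of the within-participant differences
# divided by the standard error of those differences (df = n - 1).
diffs = novel - repeated
t = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(len(diffs)))

# t > 0 here means responses to repeated layouts were faster than to novel
# layouts, i.e. the direction consistent with a contextual cueing effect.
```

In practice one would obtain the p-value from the t distribution with n − 1 degrees of freedom (e.g. via a statistics package) rather than computing it by hand.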
Table 2.
 
Comparison of reaction times (RT) between the two variability conditions in the prelearning stage of Experiment 2. Notes: This stage comprises visual search scenes with novel layouts only. The statistical method is an independent-samples t-test.
Table 3.
 
Comparison of RT between novel and repeated layouts in the contextual cue learning stage of Experiment 2. Notes: Significantly faster reaction times for repeated layouts than for novel layouts indicate the presence of a contextual cueing effect. The statistical method is a paired-samples t-test.