Abstract
There is no consensus as to how to characterize eye fixations during visual search. On the one hand, J. M. Wolfe, G. A. Alvarez, and T. S. Horowitz (2000) have described them as a haphazard sequence of fixations. On the other hand is research that shows systematic repetition of visual patterns when freely viewing a scene (T. Foulsham & G. Underwood, 2008; D. Noton & L. W. Stark, 1971a). Two experiments are reported that demonstrate the repetition and adaptation of visual scans during visual search, supporting an adaptive scanning hypothesis. When trials were repeated in a simple search task, visual scan similarity and search efficiency increased. These increments in similarity and efficiency demonstrate the systematic and adaptive nature of visual scans to the characteristics of the visual environment during search.
Visual scans were operationally defined as the sequence of fixated items that occur from the time a search display is presented until a participant responds to the target and included all fixations on target and distractor items. Where scans have been defined as a specific number of fixations for determining scan repetition (cf. Foulsham & Underwood, 2008), the current operational definition provided the opportunity for the number of fixations to change with experience.
Arbitrary identifiers (i.e., letters) were assigned to display item locations. For each search display, distractor and target locations were assigned an identifier unique to the search display. Thus, repeating displays had identical letter identifiers for target and distractor locations. In novel displays, distractor locations were randomly assigned identifiers and the target was assigned “T.” The use of identifiers instead of display coordinates reduced the complexity of the algorithm from two comparisons for determining if two fixated items are at the same location (one for the x and one for the y display coordinates) to one comparison (between letter identifiers). The similarity results from the novel displays served as a control for comparing results from repeating displays.
To determine the degree of similarity between two visual scans, the Levenshtein (1966) distance algorithm was used. The algorithm determines the minimum number of insertions, deletions, and replacements necessary to change one scan into another. To demonstrate how the Levenshtein algorithm works with letter identifiers, take the comparison between the two visual scans FIREMEN and POLICEMEN, where each letter represents a fixated item and repeated letters represent a second fixation on an item (i.e., a refixation). To change “FIREMEN” into “POLICEMEN,” the solution would be (1) to insert a “P” to the left of the “F”; (2) to insert an “O” to the left of the “F”; (3) to replace “F” with “L”; and (4) to replace “R” with “C,” resulting in a minimum-edit distance of 4. Hence, similarity is based on the sequence of fixated items rather than the proportion of items fixated. For example, two scans that contained the same fixated items but visited the distractors in reverse order (e.g., ABCDT and DCBAT) would be as similar as two scans that contained no common fixated items other than the target (e.g., GHEJT and ABCDT); both pairs have a minimum-edit distance of 4.
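The minimum-edit distance described above can be sketched in a few lines of Python. This is a minimal illustration of the standard dynamic-programming form of the Levenshtein algorithm, not the authors' implementation; the function name `levenshtein` and the use of strings to hold letter identifiers are assumptions made for the example.

```python
def levenshtein(scan_a, scan_b):
    """Minimum number of insertions, deletions, and replacements needed
    to change scan_a into scan_b (each element is one fixated item)."""
    # previous[j] is the distance between the items of scan_a processed
    # so far (excluding the current one) and the first j items of scan_b.
    previous = list(range(len(scan_b) + 1))
    for i, item_a in enumerate(scan_a, start=1):
        current = [i]  # reducing i items to an empty scan takes i deletions
        for j, item_b in enumerate(scan_b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # replace, or keep when identical
            ))
        previous = current
    return previous[-1]


print(levenshtein("FIREMEN", "POLICEMEN"))  # 4
```

Because only one letter comparison is made per pair of fixated items, the sketch also reflects the single-comparison benefit of letter identifiers over display coordinates.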
A visual scan was defined as the first fixated item during a trial to the last fixated item at the trial response. Analyzed visual scans included fixations and refixations on distractors and targets. Target fixations were not excluded, as they too are part of the visual scan from a search trial, could be refixated during search, and there was no requirement to fixate the target before or during a response.
Minimum-edit distances are normalized to control for differences in lengths of compared visual scans (Foulsham & Underwood, 2008; Josephson & Holmes, 2002). The normalized minimum-edit distance is then subtracted from one to obtain the normalized similarity index, or NSI:
$$\mathrm{NSI} = 1 - \frac{\mathrm{MED}}{S_{\mathrm{longest}}}$$
where MED is the minimum-edit distance and $S_{\mathrm{longest}}$ is the length of the longer of the two compared scans.
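As a worked illustration of the index, consider again the FIREMEN and POLICEMEN scans, which have a minimum-edit distance of 4 and a longer scan length of 9. The calculation below assumes, as the definitions above state, that the distance is normalized by the length of the longer scan.

```latex
\mathrm{NSI} = 1 - \frac{\mathrm{MED}}{S_{\mathrm{longest}}}
             = 1 - \frac{4}{9}
             \approx 0.56
```

Two identical scans give an NSI of 1, and two scans that require as many edits as the longer scan has items give an NSI of 0.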
There is an inverse relationship between the NSI metric and the length of visual scans, such that as scan lengths decrease NSI values increase. Monte Carlo simulations were performed to ensure that NSI values obtained from novel displays approximated the similarity from two random scans, providing a second control for increases in the NSI metric from repeating and novel displays. The Monte Carlo simulations produce a “special case” of the anarchy hypothesis. Where the anarchy hypothesis would not predict a reduction in the number of fixations to find the target, the simulations are based on human data that may demonstrate a reduction of fixations with experience. Consequently, the Monte Carlo simulation results can be considered a conservative prediction of visual scan similarity based on the anarchy hypothesis.
To obtain NSI values from random sequences for comparison to human NSI values from repeating search displays, sequences must be approximately the same length as the human visual scans. To obtain a random sequence of appropriate length, visual scan lengths from novel search displays for each epoch were first obtained from the human data. Second, two random samples of visual scan length, $L_1$ and $L_2$, were sampled from the human data without replacement because the number of refixations in the human data was assumed to be very low.
The two scan lengths were then used to construct the random sequences. The number of fixations for each length was reduced by one, $L_X - 1$, and items were randomly sampled without replacement to produce a random sequence. Once the random sequence was produced, a target identifier was added to the end of the sequence, based on the simplifying assumption that participants fixated the target when they responded. This was done for $L_1$ and $L_2$, where the first sequence, $S_1$, had a length of $L_1$, and the second sequence, $S_2$, had a length of $L_2$. These steps effectively produced pseudorandom scans. The scans were not completely random because items were sampled without replacement. After producing the two pseudorandom sequences, their similarity was computed using the NSI metric.
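The construction of one pseudorandom comparison can be sketched as follows. This is an illustrative reading of the steps above rather than the authors' simulation code: the function names, the distractor identifiers, and the list of scan lengths are all hypothetical, and the number of distractor locations per display is not given in the text.

```python
import random


def levenshtein(a, b):
    """Minimum-edit distance between two sequences of item identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def nsi(scan_a, scan_b):
    """Normalized similarity index: 1 - MED / length of the longer scan."""
    longest = max(len(scan_a), len(scan_b))
    if longest == 0:
        return 1.0  # two empty scans are treated as identical (assumption)
    return 1.0 - levenshtein(scan_a, scan_b) / longest


def pseudorandom_scan(length, distractor_ids, rng, target_id="T"):
    """Sample length - 1 distractors without replacement, then append the target."""
    n_distractors = min(length - 1, len(distractor_ids))
    return rng.sample(distractor_ids, n_distractors) + [target_id]


def simulated_nsi(human_scan_lengths, distractor_ids, rng):
    """One pseudorandom comparison: draw two lengths, build two scans, score them."""
    length_1, length_2 = rng.sample(human_scan_lengths, 2)  # without replacement
    scan_1 = pseudorandom_scan(length_1, distractor_ids, rng)
    scan_2 = pseudorandom_scan(length_2, distractor_ids, rng)
    return nsi(scan_1, scan_2)


rng = random.Random(42)             # fixed seed so the sketch is repeatable
distractors = list("ABCDEFGHIJK")   # hypothetical distractor identifiers
lengths = [2, 3, 3, 4, 5]           # hypothetical human scan lengths for one epoch
values = [simulated_nsi(lengths, distractors, rng) for _ in range(1000)]
print(sum(values) / len(values))    # mean pseudorandom NSI for this toy setup
```

Because both pseudorandom scans end with the target identifier, their NSI is always above zero, and it rises as the sampled lengths shrink.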
Accuracy results are presented first to demonstrate that participants successfully completed the task, followed by search efficiency analyses. Finally, scan similarity results are presented.
Trial accuracy. A 2 (search display type) × 20 (block) repeated measures analysis of variance (ANOVA) was conducted to determine if there were systematic differences in response accuracy. The dependent variable was the proportion of correct trials, per block. Block and the Search Display Type × Block interaction violated sphericity and the Greenhouse–Geisser correction was used.
The mean proportion of correct responses was high: 0.98. There was not a significant main effect of search display type, F(1, 37) < 0.001, p > 0.99, NS, indicating no difference in response accuracy between repeating and novel search displays. There was not a significant main effect of block, F(10.52, 389.29) = 1.21, p > 0.28, NS, nor did search display type interact with block, F(10.9, 403.19) = 0.69, p > 0.73, NS.
Efficiency analyses. To determine if participants' search efficiency increased, trial response times and the number of fixations and refixations to find the target were analyzed across blocks of the experiment. We hypothesized that the number of fixations would decrease with increased experience of repeating search trials. A failure to reject the hypothesis suggests that participants are increasing search efficiency, whereas a rejection of the hypothesis supports the general strategy hypothesis.
Response times. To determine if response times differed as a function of search display type or experience, a 2 (search display type) × 20 (block) repeated measures ANOVA was conducted. Block and the Search Display Type × Block interaction violated sphericity and the Greenhouse–Geisser correction was used.
There was a significant main effect of block, F(7.31, 255.7) = 25.92, p < 0.001, where response times were gradually reduced from a mean of 1287.2 ms in block 1 to a mean of 943.6 ms by block 20. There was not a main effect of search display type, F(1, 35) = 0.12, p > 0.73, NS, nor was there a significant Search Display Type × Block interaction, F(9.9, 346.61) = 0.57, p > 0.83, NS.
Fixation count. Fixations were determined using a sample-based fixation algorithm (see Myers & Schoelles, 2005, for a full description of the algorithm). Once a fixation was calculated, the closest display item within 2° of visual angle was assigned to the fixation. Consecutive fixations on the same display item were aggregated into a single fixation. If there was not a display item within 2° of visual angle, “middle of nowhere” was assigned to the fixation. Fixation sequences (i.e., visual scans) were determined for each trial.
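The assignment of fixations to display items can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, fixation and item positions are assumed to be already expressed in degrees of visual angle, "within 2°" is read as a distance of at most 2°, and only back-to-back fixations on an actual display item are merged.

```python
import math

NOWHERE = "middle of nowhere"


def nearest_item(fix_x, fix_y, items, max_deg=2.0):
    """Return the identifier of the closest item within max_deg, else NOWHERE.

    items maps an identifier to its (x, y) position in degrees of visual angle.
    """
    best_id, best_dist = NOWHERE, max_deg
    for item_id, (x, y) in items.items():
        dist = math.hypot(fix_x - x, fix_y - y)
        if dist <= best_dist:
            best_id, best_dist = item_id, dist
    return best_id


def fixations_to_scan(fixations, items, max_deg=2.0):
    """Convert a list of (x, y) fixations into a sequence of item identifiers."""
    scan = []
    for fix_x, fix_y in fixations:
        item_id = nearest_item(fix_x, fix_y, items, max_deg)
        # Merge back-to-back fixations on the same display item into one.
        if scan and scan[-1] == item_id and item_id != NOWHERE:
            continue
        scan.append(item_id)
    return scan


# Hypothetical display with two distractors and a target.
items = {"A": (0.0, 0.0), "B": (5.0, 0.0), "T": (10.0, 0.0)}
fixations = [(0.5, 0.2), (0.1, -0.3), (5.5, 0.5), (20.0, 20.0), (9.5, 0.0)]
print(fixations_to_scan(fixations, items))
# ['A', 'B', 'middle of nowhere', 'T']
```

The resulting identifier sequence is the visual scan that enters the similarity analyses.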
To determine if the number of fixations to find the target was reduced across blocks, a 2 (search display type) × 20 (block) repeated measures ANOVA was performed on the mean number of fixations on stimulus items per block. Block and the Search Display Type × Block interaction violated sphericity and the Greenhouse–Geisser correction was used. The average number of fixations was 2.86.
There was not a main effect of search display type, F(1, 35) = 0.485, p > 0.49, NS. Importantly, there was a main effect of block, F(4.39, 153.76) = 12.00, p < 0.001, demonstrating that the mean number of fixations to find the target was reduced with experience (M Block-1 = 3.38; M Block-20 = 2.38). There was not a significant Search Display Type × Block interaction, F(9.67, 348.9) = 0.99, p > 0.45, NS.
The average number of refixations was submitted to a 2 (search display type) × 20 (block) repeated measures ANOVA. There were very few refixations, less than 0.03 on average. However, there was a main effect of block, F(19, 342) = 2.8, p < 0.001, where the number of refixations was reduced across blocks (M Block-1 = 0.08; M Block-20 = 0.02).
The results demonstrate that participants increased their search efficiency across blocks—accuracy was maintained at a high proportion of correct trials (0.98) while the amount of time to find the target decreased by 343.6 ms from the first to the last block.
Scan similarity analyses. The following analysis was conducted to determine if visual scans increase in similarity during visual search, reflecting an established and regularly used method for finding targets in repeating search displays. We hypothesized that visual scans would increase in similarity with increased experience of repeating search trials. A failure to reject the hypothesis suggests that participants are developing skillful search through repeating displays, whereas if visual scans do not increase in similarity, then there is support for the anarchy hypothesis.
Visual scans were aggregated into epochs of blocks, where one epoch equaled five blocks. For each participant, trials from each of the repeating search displays within each epoch (5 views of the same repeating search display) were compared against each other producing 10 NSI values for each repeating search display. The 10 NSIs were then averaged to obtain the mean epoch NSI for each of the 12 repeating search displays. Next, the mean NSI for each epoch was averaged across repeating displays to acquire the average repeating search display NSI for each epoch of trials.
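The aggregation described above can be sketched as follows. Five views of one repeating display yield 10 unique pairwise comparisons, whose NSI values are averaged per display and then across displays. The function names and the example scans are hypothetical, and the sketch is an illustration of the procedure rather than the authors' analysis code.

```python
from itertools import combinations
from statistics import mean


def levenshtein(a, b):
    """Minimum-edit distance between two sequences of item identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def nsi(scan_a, scan_b):
    """Normalized similarity index: 1 - MED / length of the longer scan."""
    return 1.0 - levenshtein(scan_a, scan_b) / max(len(scan_a), len(scan_b))


def mean_display_nsi(scans):
    """Average NSI over every unique pair of scans of one display in one epoch."""
    return mean(nsi(a, b) for a, b in combinations(scans, 2))


def mean_epoch_nsi(scans_by_display):
    """Average the per-display means across all repeating displays in an epoch."""
    return mean(mean_display_nsi(scans) for scans in scans_by_display.values())


# Hypothetical scans: five views of each of two repeating displays in one epoch.
epoch = {
    "display_1": ["ABCT", "ABT", "ABCT", "ACT", "ABT"],
    "display_2": ["DET", "DET", "DFET", "ET", "DET"],
}
print(len(list(combinations(epoch["display_1"], 2))))  # 10
print(mean_epoch_nsi(epoch))
```

The same pairwise scheme applies to any number of views; only the count of comparisons changes.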
In order to determine if the similarity of visual scans across repeating search displays is greater than predicted by chance, NSI values for scans from novel search displays were also determined. Novel displays were only compared with other novel displays that shared target location. This produced a chance NSI value with which to compare NSI values from repeating displays.
To determine if there were differences in NSI values between repeating and novel search displays across epochs, a 2 (search display type) × 4 (epoch) repeated measures ANOVA was performed on all mean NSI values. Epoch violated the sphericity assumption and corresponding results use the Greenhouse–Geisser correction.
There was a main effect of search display type, F(1, 35) = 106.08, p < 0.001, where visual scans from repeating search displays (M Repeating = 0.67) were significantly more similar than novel ones (M Novel = 0.60). There was also a main effect of epoch, F(2.18, 69.79) = 36.13, p < 0.001, demonstrating an increase in similarity across epochs. However, there was not a significant Search Display Type × Epoch interaction, F(3, 99) = 0.78, p > 0.49, NS.
The above analyses did not reveal evidence that visual scans from repeating displays increased in similarity at a faster rate than scans from novel displays. Post hoc analyses using the Bonferroni correction revealed that there were significant differences in NSI values between repeating and novel search displays within the first epoch (p < 0.001). Hence, it may be the case that the similarity of visual scans does increase faster for repeated than for novel displays, but that this increase occurs rapidly within the first epoch.
To investigate the likelihood of a rapid differential onset of similarity within the first six views of repeating displays, we derived five consecutive scan comparisons for repeated displays and for novel displays with the same target location (i.e., $S_1$ vs. $S_2$, $S_2$ vs. $S_3$, $S_3$ vs. $S_4$, $S_4$ vs. $S_5$, $S_5$ vs. $S_6$). The NSI values from these comparisons were used in a 2 (search display type) × 5 (consecutive scan) repeated measures ANOVA. This analysis yielded a significant main effect of search display type, F(1, 35) = 4.69, p = 0.037. However, it did not show a main effect of consecutive scan, F(3, 105.14) = 0.22, p > 0.88, nor was there a reliable Search Display Type × Consecutive Scan interaction, F(4, 140) = 0.20, p > 0.93, NS.
Monte Carlo simulations. To produce pseudorandom NSI values across epochs, 10 NSI values were determined for the pseudorandom sequences, effectively reproducing the same number of NSI values from novel search displays in the experiment. This was repeated for each epoch. To mimic the reduction of human scan lengths across epochs of the experiment, scan lengths from the first epoch of Experiment 1 were only used for the first epoch of the pseudorandom sequence comparisons, scan lengths from the second epoch of Experiment 1 were only used for the second epoch of the pseudorandom sequence comparisons, and so on. As was the case for the human-generated visual scans, the 10 NSI values were then averaged to obtain the mean epoch NSI for each of the four epochs. This process provided the same number of comparisons as that from a single human participant from the experiment. To minimize variability, the Monte Carlo simulation described above was run 200,000 times (10,000 times for each participant in the experiment, see Figure 1).
As mentioned earlier, the Monte Carlo simulation results provide a special case of the anarchy hypothesis because the number of fixations composing a scan was reduced across epochs of trials in the simulations. Consequently, the simulation results can be interpreted as a conservative estimate of scan similarities predicted by the anarchy hypothesis.
First, the results from the Monte Carlo simulations indicate that as the number of fixations to find the target is reduced with experience, NSIs gradually increase, much as they do for visual scans from novel trials. Second, the results from the Monte Carlo simulations indicate that human visual scans are more systematic than predicted by the anarchy hypothesis, even scans from novel displays. A possible explanation is that participants may have learned the target locations associated with novel displays. An alternative explanation is that the participants are learning a more general search skill associated with non-stable search displays but that cannot be specialized as within stable search displays, as suggested by the general strategy hypothesis. Further, participants' search times decrease for novel displays (Chun & Jiang, 1998; Chun, 2000), demonstrating that participants are learning something general enough to apply to novel search displays. At the very least, NSI values obtained from the novel search displays highlight the ability of the human visual search process to adapt to the environment.
Results are presented in the same order as in Experiment 1: accuracy analyses, efficiency analyses, and scan similarity analyses.
Letter classification accuracy. A 3 (configuration group) × [2 (search display type) × 30 (block)] mixed ANOVA was conducted on participants' accuracy of the auditory letter classification task in the dual-task condition. There were no significant effects, and accuracy was an adequate 78% correct, on average.
Visual search accuracy. A 2 (transfer-load) × 3 (configuration group) × [2 (search display type) × 30 (block)] mixed ANOVA was conducted on trial accuracy. The dependent variable was the proportion of correct trials.
Accuracy for the search task was high, never falling below 91% correct across blocks of the experiment. There was a significant main effect of search display type, F(1, 22) = 9.344, p = 0.006, where novel search displays resulted in a higher mean proportion of correct trial responses (M Novel = 0.973) than repeating displays (M Repeating = 0.968). This result demonstrates a difference in accuracy by search display type, where novel stimuli are responded to more accurately than repeating stimuli. Although the effect is significant, the difference between the conditions (0.005) is trivial and has little bearing on the hypotheses being tested, as the response accuracy for both search display types was very high and remained high throughout the experiment. There was also a Search Display Type × Load × Configuration Group interaction, F(2, 22) = 3.867, p = 0.036, where one of the configuration groups led to reduced accuracy for repeating and novel search displays in the single-task condition but had little effect on repeating or novel displays in the dual-task condition. Finally, there was a significant Block × Load interaction, F(29, 638) = 2.58, p < 0.001, where the dual-task group increased accuracy at a faster rate across blocks than the single-task group. These results demonstrate that participants were focused on finding the target and accurately responding and that the concurrent letter classification task negatively affected search response accuracy.
Efficiency analyses. To determine if participants' search efficiency increased, trial response times and the number of fixations and refixations to find the target were analyzed across blocks of the experiment. Just as in Experiment 1, we hypothesized that the number of fixations would decrease with increased experience of repeating search trials. A failure to reject the hypothesis suggests that participants are increasing search efficiency, whereas a rejection of the hypothesis supports the general strategy hypothesis.
Visual search response times. A 2 (load) × 3 (configuration group) × [2 (search display type) × 30 (block)] mixed ANOVA was conducted on response latency. There was a main effect of block, F(29, 609) = 12.84, p < 0.001, where response times decreased with experience. The single-task group had an average response time of 2197.13 ms in the first block and an average response time of 1657.43 ms in the 30th block. The dual-task group had an average response time of 4363.68 ms in the first block and an average response time of 1971.2 ms in the 30th block. There was also a main effect of load, F(1, 21) = 10.144, p = 0.004, where the dual-task group had longer response times than the single-task group. There was a significant Block × Load interaction, F(29, 609) = 3.33, p < 0.001, where the dual-task group reduced trial response latencies across blocks at a faster rate than the single-task group. No other effects reached significance.
If participants were increasing search efficiency, then the number of fixations to find a target should decrease across blocks of trials. A 2 (load) × 3 (configuration group) × [2 (search display type) × 30 (block)] mixed ANOVA was performed on fixations on display items. There was a main effect of load, where the dual-task group had more fixations than the single-task group, F(1, 20) = 10.16, p = 0.005. There was a significant main effect of block, where the number of fixations decreased with task experience, F(29, 580) = 17.5, p < 0.001. There was also a significant Block × Load interaction, F(29, 580) = 3.43, p < 0.001, where the number of fixations to find the target was reduced with experience and cognitive load (M Dual-Block-1 = 9.33; M Dual-Block-30 = 5.44; M Single-Block-1 = 6.43; M Single-Block-30 = 4.49). Furthermore, there was a simple main effect for the single-task group, where the number of fixations was significantly reduced across blocks, F(29, 319) = 1.61, p < 0.05. There was not a significant effect of search display type, F(1, 20) = 0.91, p > 0.35, nor was there a significant Load × Search Display Type interaction, F(1, 20) = 0.35, p > 0.56. No other effects reached significance.
The average number of refixations was submitted to a 2 (load) × 3 (configuration group) × [2 (search display type) × 30 (block)] mixed ANOVA. There were very few refixations, less than 0.33 on average, yet there were more than in Experiment 1. There was a main effect of block, F(29, 609) = 13.12, p < 0.001, where the number of refixations was reduced across blocks. There was a main effect of load, F(1, 21) = 19.77, p < 0.001, where the dual-task group had more refixations on average per block of trials (M Dual = 0.43) than the single-task group (M Single = 0.24). Not surprisingly, there was a Block × Load interaction, F(29, 609) = 4.39, p < 0.001, where the dual-task group reduced refixations across blocks at a faster rate than the single-task group.
The results demonstrate that participants increased their search efficiency across blocks—accuracy was maintained at a high proportion of correct trials (0.97) while the amount of time to find the target decreased by 539.7 ms from the first to the last block in the single-task condition and was decreased by 2392.5 ms in the dual-task group.
Scan similarity analyses. The similarity analysis was conducted to determine if visual scans increased in similarity during visual search, reflecting an established and regularly used method for finding targets in repeating search displays. We hypothesized that visual scans would increase in similarity with increased experience of repeating search trials. A failure to reject the hypothesis suggests that participants are repeating scans to find the target, whereas if visual scans do not increase in similarity then there is support for the anarchy hypothesis. The NSI metric used in Experiment 1 was also used in Experiment 2. Monte Carlo simulations were run to demonstrate a special case of the anarchy hypothesis and to provide a control along with the novel search displays.
To determine if there were differences in NSI values as a function of load, configuration group, search display type, and epoch, a 2 (load) × 3 (configuration group) × [2 (search display type) × 6 (epoch)] mixed ANOVA was performed on all mean NSI values. There was a main effect of search display type, F(1, 20) = 115.12, p < 0.001, where visual scans from repeating search displays (M Repeating = 0.49) were significantly more similar than visual scans from novel displays (M Novel = 0.39). There was a main effect of epoch, F(5, 100) = 29.29, p < 0.001, demonstrating an increase in similarity across epochs. Importantly, there was a reliable Search Display Type × Epoch interaction, F(5, 100) = 5.40, p < 0.001, demonstrating that visual scans from repeating search displays increased in similarity across epochs at a faster rate than visual scans from novel displays (see Figure 2). There was also a main effect of load, F(1, 20) = 10.55, p < 0.001, revealing that the single-task group's mean NSI value (M Single = 0.47) was significantly higher than the dual-task group's mean NSI value (M Dual = 0.43).
The results of the Monte Carlo simulations replicated the results from Experiment 1: The similarity of visual scans from novel and repeating search displays was greater than NSIs produced by pseudorandom sequences of comparable lengths to the participants' average scan length, again ruling out the anarchy hypothesis. These results suggest participants are learning something that leads to behavioral stability within the task environment. One possibility is that participants are learning the relatively small set of target locations used in both experiments. Another is that participants are learning a more general skill, namely better distinguishing distractors and the target in the periphery.
Underlying influences on visual scan similarity. Increased load on cognitive resources reduced the similarity of scans, demonstrating the importance of endogenous processes (i.e., memory, attention, skill acquisition, etc.) for the repetition of scans. However, NSI values from repeating displays in the dual-task condition remained higher than NSI values from novel displays, suggesting that stable environmental information provides a means for repeating visual scans when cognitive processes are taxed. The design of Experiment 2 made it possible to determine if visual scan adaptation across epochs is solely a function of the visual stimulus. If the fixation locations composing a visual scan are determined from only external influences (e.g., the configuration of distractors and the target, visual salience, etc.) and if internal cognitive processes required for completing the task are invariant across individuals (i.e., memory, eye movement programming, etc.), then visual scans from different participants searching through the same repeating search displays (between-participant NSI values) should be as similar as visual scans from the same person searching through repeating displays (within-participant NSI values).
To determine between-participant NSI values, visual scans from participants searching through the same repeating displays were compared at each block for each of the 12 repeating trials, as well as for novel trials. For example, participant 1's visual scan from repeating-display-A was compared to participant 2's visual scan from repeating-display-A and then to participant 3's visual scan from repeating-display-A for each of the 12 repeating displays. Next, mean NSI values from each stimulus per block were averaged into epochs. Finally, all repeating stimuli were averaged together to get the mean repeating between-participant NSI at each epoch.
Although the within-participant and between-participant NSI values should not be submitted to statistical analyses because they contain the same data, Figure 3 shows that between-participant NSI values from repeating search displays are more similar than between-participant NSI values from novel search displays and that the degree of between-participant similarity is less than the within-participant similarity. Because between-participant and within-participant NSI values were not equivalent, the results indicate that visual scans were not produced solely from exogenous influences but were produced from a mix of internal cognitive processes interacting with the structure of the environment, as argued by Josephson and Holmes (2002) and Foulsham and Underwood (2008). These results further rule out the anarchy hypothesis. It remains unclear which internal cognitive processes produce the differences in visual scans across participants operating with the same goal within the same environment; this question provides a ripe area for future research.