Research Article  |   February 2009
Learning cue validity through performance feedback
Journal of Vision February 2009, Vol.9, 18. doi:10.1167/9.2.18
      Jason A. Droll, Craig K. Abbey, Miguel P. Eckstein; Learning cue validity through performance feedback. Journal of Vision 2009;9(2):18. doi: 10.1167/9.2.18.

      © 2016 Association for Research in Vision and Ophthalmology.

Abstract

Targets of a visual search are often not randomly positioned within a scene, but may be more likely to co-occur adjacent to other objects or background properties. Studies on target-cue co-occurrence (i.e., cue validity) suggest that observers can exploit this knowledge to increase performance in detection and localization tasks. However, little is known regarding how observers learn this co-occurrence. The present experiment sought to determine whether observers were capable of learning the probability of cue validity, and how this learning is shaped by feedback. Separate groups of subjects performed a search task under one of three feedback conditions providing varying degrees of information: unsupervised feedback, response reinforcement, or supervised feedback. Results show that saccadic and perceptual decisions reflected larger cueing effects as feedback information increased. This suggests that internal signals generated from response selection are insufficient for exploiting cue validity, but that reinforcement may be sufficient. However, final explicit estimates of cue validity were independent of feedback condition, suggesting that implicit behaviors are subject to unique learning constraints. Comparison to an ideal observer reveals that the rate at which participants learned cue validity was suboptimal, which may have impaired performance during initial familiarization with scene statistics.

Introduction
Performance in visual search tasks is often facilitated by saccadic decisions, re-orienting our direction of gaze every few hundred milliseconds. These saccadic decisions, in turn, aid our final perceptual decision of detecting or localizing the sought-after target. Saccadic and perceptual decisions are, of course, guided by visual sensory information, such as color and luminance. However, while gaze may be directed to locations that are salient in virtue of their stimulus properties, such as regions with the highest luminance or chromaticity (Itti & Koch, 2001), a visual system responding only in a reactive manner to elements in a scene would have no means of directing the observer to objects or areas of immediate interest or that were relevant for the task at hand. Thus, visual search is also guided by top-down knowledge, such as memory for spatial configuration (Chun & Jiang, 1998; Peterson & Kramer, 2001), target properties (Eckstein, Beutter, & Stone, 2001; Findlay, 1997; Maljkovic & Nakayama, 1994; Rajashekar, Bovik, & Cormack, 2006; Rao, Zelinsky, Hayhoe, & Ballard, 2002; Tavassoli, van der Linde, Bovik, & Cormack, 2007), or estimates of where the target is likely to be found (Eckstein, Dresher, & Shimozaki, 2006; Maljkovic & Nakayama, 1996; Torralba, Oliva, Castelhano, & Henderson, 2006; Walthew & Gilchrist, 2006). This top-down knowledge cannot be innate, and therefore must either be instructed or learned through one's own experience. 
In the present paper, we sought to examine how observers learn the frequency with which a target stimulus is associated with the location of another stimulus. We specifically explore how learning target-cue co-occurrence, referred to as cue validity in the attention literature, influences observers' saccadic and perceptual decisions, and how this learning is guided by task feedback. 
Exploiting scene statistics to improve search performance
Despite demonstrations on the utility of exploiting cue validity (Eckstein, Pham, & Shimozaki, 2004; Eckstein, Shimozaki, & Abbey, 2002; Palmer, Ames, & Lindsey, 1993; Posner, 1980; Shaw & Shaw, 1977), little is known regarding how this statistic is learned by the observer. However, there is growing evidence that observers show improved search performance within statistically familiar scenes, and that this learning can occur within the course of a single experimental session. For example, observers exhibit reduced reaction times (Chun & Jiang, 1998, 1999) and increased target accuracy for saccades (Peterson & Kramer, 2001) during repeated presentations of distractor configurations. Observers also learn the probable spatial location of a target, evidenced by increased saccadic accuracy and decreased reaction times (Geng & Behrmann, 2002, 2005; Walthew & Gilchrist, 2006). Learning may also include non-spatial properties of a scene, such as the probability that an object will undergo a change, and this learning is manifest in the distribution of both saccades and perceptual decisions (Droll, Gigone, & Hayhoe, 2007). 
A unique feature in these demonstrations of learning is that because experimenters do not specify the underlying statistical structure, observers must learn this structure on their own through exploration. As observers explore a novel environment, they may regulate their propensity to sample different information from trial to trial, dynamically adjusting their estimates of scene statistics using recent observations and performance feedback (Daw, O'Doherty, Dayan, Seymour, & Dolan, 2006; Glimcher, 2003; Sugrue, Corrado, & Newsome, 2004, 2005; Yu & Dayan, 2005). Neurophysiology experiments have shown that this learning process may be controlled by cortical areas traditionally described as encoding eye movements or visual attention, such as lateral intraparietal area (Platt & Glimcher, 1999), frontal eye fields (Ding & Hikosaka, 2006; Roesch & Olson, 2003), basal ganglia (Hikosaka, Takikawa, & Kawagoe, 2000) and superior colliculus (Ikeda & Hikosaka, 2003). These same areas are also sensitive to task structure and reward variables, including performance feedback. Encoding task related variables includes not only object relevance (Goldberg, Bisley, Powell, Gottlieb, & Kusunoki, 2002), but also the history of reinforcement (Glimcher, 2003; Hikosaka et al., 2000; Platt & Glimcher, 1999; Sugrue et al., 2004; Watanabe, Lauwereyns, & Hikosaka, 2003), feedback outcome of a saccade task (Stuphorn, Taylor, & Schall, 2000), or the probable location of a saccade target (Basso & Wurtz, 1998). This sensitivity to task and reward structure suggests that saccadic and perceptual decisions are mediated by cortical areas that not only facilitate the learning of scene statistics but also the appropriate actions (Hayhoe & Ballard, 2005). 
Task feedback
An open question is what information observers require to facilitate learning, and exactly what computations are involved. One critical component of learning is the feedback received during a task (Seitz & Watanabe, 2005), especially within probabilistic environments with uncertain structure (Yu & Dayan, 2005). The type of feedback available in a task dictates at least three possible learning mechanisms, which differ from each other in the degree to which the input (e.g. visual scene) is associated with the output (e.g. instructed location of target position). 
In unsupervised learning, the agent is presented only with input; no information is provided regarding how to assign particular instances of the input into categories of output, and chosen actions or decisions have no consequence on the output, or final state. Because there is no a priori framework with which to frame the incoming information, computations involved in this learning typically emphasize techniques of clustering, or generalizing categories from similar instances (Barlow, 1989). For example, bottom-up patterns of visual stimuli may direct the development of neural receptive fields, without being guided by top-down knowledge about the content of the scene. Unsupervised algorithms are capable of learning object categories (Fei-Fei, Fergus, & Perona, 2006), action perception (Niebles, Wang, & Fei-Fei, 2006), face recognition (Bartlett & Sejnowski, 1996), and cortical areas are capable of implementing this feed-forward learning (Serre et al., 2005). Observers are also capable of learning basic scene statistics with unsupervised feedback, such as object frequency and the joint probability of object location, as demonstrated by shape recognition following passive observation (Fiser & Aslin, 2001). However, because of the typically large number of examples required for unsupervised learning, it is not clear if this form of learning is sufficient to adapt to novel environments which require rapid adjustment to scene statistics. 
In reinforcement learning, the agent (e.g. observer or organism) is presented with feedback based on the quality of their decision in response to the present state of the world (Sutton & Barto, 1998). Feedback is typically a scalar reward variable, and the agent tries to maximize the total gain (or minimize loss). However, reward is typically only delivered for a correct choice; no reward is delivered for an incorrect choice. Thus, following a rewarded correct response, the agent can assign credit to their action, or improve the estimate to the state of the world, but following an incorrect response, it is unclear what information the agent should update. A computational challenge posed to the agent with reinforcement feedback is how to exploit knowledge of previously learned information, while simultaneously exploring other less optimal choices. Occasionally selecting suboptimal choices may incur more accurate estimates of the reward structure, as well as greater sensitivity to unexpected changes in this structure (Yu & Dayan, 2005). Reinforcement algorithms have successfully been used to model the learning of complex tasks in which the agent both continuously estimates the state of the world while concurrently optimizing a policy that dictates which of several available actions to take in response to each state (Sprague, Ballard, & Robinson, in press). 
Providing partial information in reinforcement feedback is in contrast to supervised learning, in which the agent is presented with a set of training data including both the input and the correctly paired output (e.g. a set of visual images, paired with the correct object label or target position). After having been presented with the training data, the algorithm (or agent) must then infer the correct output given novel input (Bishop, 1995; Duda, Hart, & Stork, 2000). In the context of a visual search task, supervised feedback can be provided by revealing the correct answer on each trial, regardless of the observer's decision or action. Thus, regardless of the behavioral response, observers are shown the correct pairing between the input (scene) and the output (perceptual decision). Thus, supervised feedback exposes an observer to the greatest number of correct input–output pairings, allowing for accurate estimates to potentially accrue more quickly than in reinforcement feedback alone. 
It is often not clear which of these three learning mechanisms is employed to improve performance in visual search. One reason why the role of feedback in search tasks is unclear is that experimenters often use reaction times as a dependent measure during tasks with few errors, such as when discriminating the orientation of a target “T” among “L” distractors (Chun & Jiang, 1998; Geng & Behrmann, 2005). With these highly discriminable stimuli, subjects are likely aware of their near-perfect performance, and this self-monitoring is arguably a form of supervised feedback, or reinforcement feedback with no error trials. A second reason why identifying the learning mechanism in search is difficult is the typically small number of alternative choices available in experimental tasks. For example, in a two-alternative forced choice paradigm, providing reinforcement feedback effectively equates to supervised feedback, because when a reward is withheld following an incorrect decision, the correct answer can be inferred. 
Present study
In the present experiment, we sought to assess the role of feedback in a visual search task within a statistically structured scene. To this aim, we devised a task in which observers were instructed to localize, or reject, the presence of a contrast increment among dimmer distractors. By randomly varying the contrast with noise, and manipulating the probability with which targets appeared within colored cues, we were able both to ensure that subjects' performance was below ceiling and to provide an opportunity for subjects to improve. By comparing performance in the same task using three different feedback conditions, we show that the learning of cue validity is facilitated by informative feedback, evidenced by observers' implicit distribution of saccadic and perceptual decisions. However, we also show that numeric estimates of cue validity are independent of feedback, suggesting the possibility of a separate learning mechanism for explicit knowledge. 
Methods
Task
Figure 1 shows the structure of the “localize or reject” task for each condition. Subjects initiated each trial by fixating a central dot and pressing the space bar. Following a fixation cross for a variable duration of 50–150 ms, the Test stimulus appeared for 2000 ms. Subjects were allowed to shift gaze when the Test stimulus was presented, although they were not explicitly instructed to do so. The task was to decide whether a target contrast increment was present in one of six Gaussian pedestals (see below), each of which appeared in the center of a colored circle (i.e., the cue). Because the distributions of possible target and distractor contrast values overlapped, owing to the contrast noise added to both, perfect detection and localization of the target was impossible, thus ensuring imperfect performance. Before each experiment, subjects were told that the target would appear in the scene in approximately half of all trials. Subjects were also informed that among these target-present trials, some cues were more likely to contain the target than other cues. However, subjects were not told the values of these probabilities, or their distribution. While these probabilities were initially unknown to the observers, learning which cue(s) would be more likely to contain the target could be applied to the task to improve performance. 
Figure 1
 
“Localize or Reject” task. Subjects initiated each trial by fixating a central dot and pressing the space bar. Test stimuli included six colored circles, randomly arranged, each containing a contrast increment (e.g. Gaussian dot). Subjects were instructed to determine if one of the dots within the colored circles was a bright target among a set of less bright distractors. Following a mask of white noise, subjects used the mouse to either click on the cue in which they thought the target appeared, or to click on the word “No” that appeared randomly in one of the four corners of the screen, if they thought no target was present. Targets were present in half of all trials, and among these target-present trials the target would more frequently appear within some cues than other cues. Three feedback conditions are shown.
Following the test image, a mask of white noise was presented for 500 ms. This was followed by a Response screen, which presented the same arrangement of cues shown with the test image, and the word “No” in one of the four randomly chosen corners of the screen. Subjects then used the mouse to either click on the cue in which they thought the target had appeared, or to click on the word “No” if they thought no target had been present. 
To familiarize themselves with the procedure and visual task, subjects first performed 50 practice trials, using the same average contrast values for targets and distractors, although with smaller standard deviation (see below) and with no cues during the test. Performing the practice task gave observers an opportunity to learn the task and procedure of the experiment and allowed for general performance improvements unrelated to the probability structure of the cues. The intention of the practice was to deplete the general task/procedural learning so that performance changes during the experimental conditions would isolate performance improvements related to learning cue validity. Following the practice trials, subjects then performed 300 experimental trials as described above. 
Following the search task, we assessed subjects' explicit knowledge of cue validity. Subjects were shown a horizontal array of the six cues. Below each cue was a cell into which subjects were instructed to type a decimal fraction, representing the fraction of trials with which that cue contained the target, on target-present trials. Each subject generated a set of six estimates whose total was automatically displayed, ensuring that the sum of this set of numbers equaled one. 
Stimuli
Targets were generally brighter than distractors (29.80 vs. 21.96 cd/m² peak luminance) and were created by adding a contrast increment to a Gaussian pedestal (standard deviation = 10 pixels; 0.41°). The contrast values of the target (pedestal plus increment) and distractors (pedestal only) were perturbed independently with contrast noise (standard deviation = 3.14 cd/m²). Practice trials used the same mean contrast values for target and distractors as were used during the actual experiment, although the standard deviation of these values was smaller during the practice trials than experimental trials (1.57 vs. 3.14 cd/m²). Mean contrast values and their standard deviations were derived from pilot experiments. 
During the three hundred experimental trials, Gaussian pedestals were located within a colored circle (i.e., cue) with a radius of 30 pixels (1.23°). Cues were equidistantly positioned along the circumference of a circle with a radius of 220 pixels (8.93°). The orientation of this array, as well as the position of each cue within it, was randomized on each trial. Thus, any observed cueing effects could be attributed to the color of the cues, and not to their arbitrary spatial arrangement or position, which was reconfigured on each trial. During the test stimulus, the presence of a target within the scene was assigned with 50% probability. Among these target-present trials, the placement of the target within one of the cues was also determined randomly but governed by the following probabilities: P(Target ∣ cue) = (0.6, 0.2, 0.1, 0.1, 0, 0). Again, subjects were told that some cues were more likely to contain the target than other cues, but the probability values, or the distribution of these values, were not revealed. Assignment of cue color and cue validity was counterbalanced across subjects. To counterbalance any stimulus set effects, the same twenty-one sets of three hundred test stimuli were used for each of the three different groups of twenty-one subjects in the three experimental conditions. Within each condition, each of the twenty-one subjects was shown one of these stimulus sets. Stimuli were generated using Matlab and the Psychophysics toolbox (Brainard, 1997; Pelli, 1997) and were displayed using Eyelink software. 
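The sampling scheme described above can be sketched in a few lines. This is only an illustrative sketch in Python, not the authors' Matlab code: the function names `sample_target_cue` and `sample_session` are hypothetical, and only the 50% target-present rate and the six cue validities are taken from the text.

```python
import random

# Cue validities on target-present trials, as listed in the text.
CUE_VALIDITY = [0.6, 0.2, 0.1, 0.1, 0.0, 0.0]
P_TARGET_PRESENT = 0.5  # stated in the text


def sample_target_cue(rng):
    """Return the index of the cue holding the target, or None if absent."""
    if rng.random() >= P_TARGET_PRESENT:
        return None  # target-absent trial
    # Pick one cue according to the cue-validity distribution.
    return rng.choices(range(len(CUE_VALIDITY)), weights=CUE_VALIDITY, k=1)[0]


def sample_session(n_trials=300, seed=0):
    """Hypothetical helper: draw target locations for one session."""
    rng = random.Random(seed)
    return [sample_target_cue(rng) for _ in range(n_trials)]
```

Calling `sample_session()` yields a list of 300 entries, each either `None` (target absent) or a cue index; the two zero-validity cues (indices 4 and 5) never hold the target.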
Experimental conditions
There were three experimental conditions. In Condition 1, subjects' response was followed by a gray screen, with no information revealing the correctness of their response, or target presence or position. Thus, performance was unsupervised, in the sense that there was no external feedback signal with which to direct or correct learning of cue validity. Learning could only occur through subjects' own internal signal of response selection, or extracting the statistics of cue validity through the visual information presented during the Test. 
In Condition 2, subjects' response was followed by the word “Correct” or “Incorrect”, accurately depicting the outcome of that trial. Thus, while correct responses were reinforced, feedback was only partially informative because incorrect responses did not reveal which of the remaining choices would have been correct. 
In Condition 3, subjects were given complete feedback at the end of each trial. Immediately following the subjects' response, the previous test stimulus was shown again, in addition to information on what the correct answer would have been. If the target had been present, a white circle appeared around the cue and contrast increment that represented the target. If no target had been present, the words “No Target” appeared in the center of the screen. Thus, performance in the task was considered supervised, in the sense that the feedback included both reinforcement for correct answers and information on the correct answer following an incorrect response. 
Monitoring gaze
Gaze direction of the left eye was monitored throughout each trial on a separate PC with an Eyelink eye tracker by SMI/SR Research at a rate of 250 Hz. Saccades were detected using a velocity threshold of 35 deg/s and acceleration threshold of 9500 deg/s/s. The experimenter calibrated the eye tracker by having subjects fixate nine dots at various locations equally distributed across the screen. Calibration was considered accurate if the average error was less than one degree and did not exceed 2.0 deg at any of the nine calibration positions. Minor head movements were easily accommodated by the eye tracker and were minimized by a chin rest. After initiating each trial by fixating the central dot and pressing the space bar, the eye tracker performed any correction in drift. Subjects were occasionally recalibrated midway through the experiment, although this was not always necessary. Saccadic selection of cues was determined by calculating the minimum distance between the location of the first fixation and each of the six cues. First fixations in a trial that remained within a 70 pixel (2.87°) radius of the center were not assigned to a cue. Subjects were not made aware of the investigators' interest in eye movements in relation to the purpose of the experiment, and were not advised to use their gaze in any particular manner for the task. 
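The assignment of first fixations to cues can be illustrated as follows. This is a minimal sketch under stated assumptions: the 70-pixel exclusion radius and the 220-pixel cue ring come from the text, while the function names, the pixel coordinate convention, and the screen center used in the usage note are hypothetical.

```python
import math

CENTER_EXCLUSION_RADIUS_PX = 70  # first fixations inside this radius are unassigned


def assign_first_fixation(fixation, cue_centers, screen_center):
    """Return the index of the cue nearest the first fixation, or None.

    fixation and screen_center are (x, y) in pixels; cue_centers is a list of (x, y).
    """
    if math.dist(fixation, screen_center) <= CENTER_EXCLUSION_RADIUS_PX:
        return None  # gaze stayed near the central fixation point
    distances = [math.dist(fixation, c) for c in cue_centers]
    return distances.index(min(distances))


def cue_ring(screen_center, radius_px=220, n_cues=6, rotation=0.0):
    """Hypothetical helper: cue centers equally spaced on a circle."""
    cx, cy = screen_center
    return [
        (cx + radius_px * math.cos(rotation + 2 * math.pi * k / n_cues),
         cy + radius_px * math.sin(rotation + 2 * math.pi * k / n_cues))
        for k in range(n_cues)
    ]
```

For example, with an assumed screen center of `(512, 384)`, a first fixation at `(520, 390)` is left unassigned, whereas one at `(712, 384)` is assigned to the cue at rotation angle zero.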
Subjects
Three separate groups of twenty-one subjects each participated in one of the three feedback conditions described above. Each experimental session lasted approximately 90 minutes. Subjects were recruited among undergraduate Psychology classes at University of California Santa Barbara in accordance with guidelines outlined by the UCSB Office of Research and were given class credit for their participation. 
Ideal observer analysis
In addition to measures of percent correct and choice frequency, human performance was assessed by comparison to an ideal observer. The purpose of this comparison was to elucidate the degree to which humans were capable of extracting the statistic of cue validity and applying this information to their perceptual decisions. The ideal Bayesian observer uses all possible information available in the images, including precise contrast values for each stimulus and knowledge of the distributions of contrast values from which distractors and targets are generated. This information allows the ideal observer to estimate the likelihood that each of the six contrast values were selected from each distribution, and this likelihood is weighted by an estimate of cue validity, which can be derived from prior experience. If a stimulus location has a sufficient weighted likelihood value (indicating a target present trial), and is the maximum value among the candidate locations, that stimulus (and cue) is selected as the perceptual decision. Details of the ideal observer are developed in 1
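One plausible reading of this decision rule is a maximum a posteriori choice over seven hypotheses (target in one of the six cues, or no target). The sketch below assumes Gaussian luminance distributions with the means and standard deviation given in the Stimuli section; the function names are hypothetical, and the authors' own derivation may differ in detail.

```python
import math

# Mean and SD of peak luminance (cd/m^2), taken from the Stimuli section.
MU_TARGET, MU_DISTRACTOR, SIGMA = 29.80, 21.96, 3.14
P_PRESENT = 0.5  # prior probability of a target-present trial


def likelihood_ratio(x):
    """Gaussian likelihood ratio (target vs. distractor) for one location."""
    zt = (x - MU_TARGET) / SIGMA
    zd = (x - MU_DISTRACTOR) / SIGMA
    return math.exp(-0.5 * zt * zt + 0.5 * zd * zd)


def ideal_decision(contrasts, cue_validity):
    """Return the chosen cue index, or None for a "No Target" response."""
    # Weight each location's likelihood ratio by its cue validity.
    weighted = [v * likelihood_ratio(x) for x, v in zip(contrasts, cue_validity)]
    best = max(range(len(weighted)), key=weighted.__getitem__)
    # Posterior odds of "target at best location" against "no target".
    odds = P_PRESENT * weighted[best] / (1.0 - P_PRESENT)
    return best if odds > 1.0 else None
```

Under this sketch, a bright stimulus inside a zero-validity cue is never selected, because its weighted likelihood is zero regardless of its contrast.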
Developing an optimal learning algorithm for an observer experiencing unsupervised or reinforcement feedback is currently an unsolved problem, due to the large number of possible outcomes that must be considered when each trial outcome is uncertain. However, the ideal updating algorithm for supervised feedback is tractable and is developed in detail in 1. Briefly, the ideal observer begins with knowledge that the target-absent probability is 50%, and begins the task with an equal prior on each of the six cues for target-present trials. Throughout the course of the three hundred trials, the number of instances in which the target appears in each cue is tallied, and the value of cue validity used for decisions on successive trials is the ratio of the number of instances in which each cue contained the target to the total number of target-present trials encountered. 
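The tallying scheme for supervised feedback can be sketched as a simple counter. The class name `CueValidityTally` is hypothetical, and the fallback to equal estimates before any target-present trial has been seen is an assumption made here to match the stated equal prior; how the prior is combined with the counts in the authors' full derivation is not specified in this section.

```python
class CueValidityTally:
    """Running estimate of cue validity from supervised feedback."""

    def __init__(self, n_cues=6):
        self.counts = [0] * n_cues  # target occurrences per cue

    def update(self, target_cue):
        """Record feedback; target_cue is None on target-absent trials."""
        if target_cue is not None:
            self.counts[target_cue] += 1

    def estimate(self):
        """Return estimated P(target in cue | target present) for each cue."""
        total = sum(self.counts)
        if total == 0:
            # Assumption: use equal priors until a target-present trial occurs.
            return [1.0 / len(self.counts)] * len(self.counts)
        return [c / total for c in self.counts]
```

After feedback revealing the target in cues 0, 0, 1, 0 and 3 (with any number of target-absent trials interleaved), `estimate()` returns `[0.6, 0.2, 0.0, 0.2, 0.0, 0.0]`.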
We also explored two other ways to assign the values of cue validity to be used as likelihood weights. First, with a perpetually naive ideal observer, values of cue validity are declared equal across all cues (e.g. flat priors) and do not alter with experience. Second, with a correctly informed ideal observer, the values of cue validity are initially set to reflect the true distribution of cue validity, and also do not alter with experience. In the first few initial trials, the ideal observer will perform similarly to the naive observer since it does not have enough experience to substantially modify its estimates of cue validity. After many trials, the learning ideal observer will asymptote to the informed observer as its estimates of cue validity improve. Thus these observers serve as upper and lower bounds for the effect of prior experience on ideal observer performance. 
Rather than directly comparing human and ideal observer performance through measures of percent correct, the performance measure of merit is the efficiency of the human observer (Barlow, 1980; Eckstein, Abbey, Pham, & Shimozaki, 2004). Here, efficiency is evaluated as the ratio of the squared contrast threshold required for the ideal and human observers to obtain the performance level (percent correct) of the human observer in each trial block, t. 
$$\text{Efficiency} = \eta_t = \frac{c_{\text{ideal},t}^{2}}{c_{\text{human}}^{2}} \qquad (1)$$
Because the contrast threshold is fixed for the human observers in each trial block, as human performance (percent correct) changes throughout the experiment, the contrast threshold required for the ideal observer to match human performance adjusts accordingly (see 1 for details). 
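As a small numerical illustration of the efficiency ratio, the helper below squares the ratio of the two contrast thresholds. The function name and the example values are hypothetical and are not taken from the data.

```python
def efficiency(c_ideal, c_human):
    """Efficiency as the squared ratio of ideal to human contrast thresholds."""
    if c_human <= 0:
        raise ValueError("human contrast threshold must be positive")
    return (c_ideal / c_human) ** 2
```

For instance, if the ideal observer needed half the contrast of the human observer to reach the same percent correct, `efficiency(1.0, 2.0)` would give 0.25.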
Results
In each experimental condition, human search performance was assessed using two measures: saccadic and perceptual decisions. Saccadic decisions refer to the cue that was closest to the first fixation (Findlay, 1997), and perceptual decisions refer to the final decision indicated by the mouse click at the end of each trial (cue selection or “No Target”). Recall that subjects were not given any instruction on how to use gaze during each trial, but only to use the mouse to indicate their perceptual decision on the presence and location of the target. To prevent subjects from adopting an artificial search strategy, subjects were not told that their first saccades would be evaluated in a manner similar to their perceptual decisions at the end of each trial. Because subjects did not receive instructions on where to direct their gaze, subjects' eye movements may provide insights into mechanisms of attentional selection, as well as the natural strategies observers use when accumulating evidence for perceptual decisions. 
To assess whether observers' performance changed throughout the course of the experiment, we first analyzed saccadic and perceptual decisions with regard to general performance, including percent correct, efficiency, and overall rate of hits and false alarms, in each of five blocks of sixty trials. We next assessed whether any changes in performance might accompany changes in the rate of hits and false alarms for each cue type. Finally, we report subjects' explicit estimates of cue validity following each session to test for explicit learning. 
General performance
Percent correct
Percent correct includes both target-present trials in which the target was correctly localized, and target-absent trials in which the presence of a target was correctly rejected. Figures 2A and 2B plot the percent correct for saccadic and perceptual decisions for each of the three feedback conditions in each block of sixty trials. In this and all other figures, error bars represent one SEM across observers. 
Figure 2
 
Average proportion correct for saccadic (A) and perceptual (B) decisions for different conditions of feedback. Performance was generally better with more informative feedback, although all conditions showed improvement. (Note that percent correct for first saccades is low due to the naturally high false alarm rate in target-absent trials.) SEMs represent between-subject differences.
Two results are apparent. First, there are differences in percent correct between the three feedback conditions across the entire duration of the experiment for both saccadic decisions and perceptual decisions, with higher levels of performance in conditions with more information during feedback (saccadic: Unsupervised: 14.7%; Reinforcement: 17.1%; Supervised: 20.4%; ANOVA F(2,60) = 9.74, p = 2.26 × 10⁻⁴; perceptual: Unsupervised: 61.1%; Reinforcement: 65.7%; Supervised: 65.7%; ANOVA F(2,60) = 4.05, p = 0.02). The second observation is that there is an increase in performance as the experiment progressed. This linear trend was statistically significant or nearly significant for both saccadic and perceptual decisions in each of the three feedback conditions (saccadic: Unsupervised: F(1,20) = 2.51, p = 0.08; Reinforcement: F(1,20) = 12.22, p < 0.01; Supervised: F(1,20) = 6.16, p < 0.05; perceptual: Unsupervised: F(1,20) = 6.17, p < 0.05; Reinforcement: F(1,20) = 10.36, p < 0.01; Supervised: F(1,20) = 5.89, p < 0.05). Although the improved performance might be related to the learning of the statistical structure of cue validity, other possibilities exist, including a general reduction in internal noise and/or a gain change in the processing of contrast information at each location (Dosher & Lu, 1998). 
Measures of efficiency for supervised feedback
An ideal observer model was used to evaluate human efficiency in the task (see Methods and 1). Efficiency was evaluated with an ideal observer using each of three different sources of information for cue validity: Naive (equal priors), ideal learning with supervised feedback (see 1), and correctly informed (actual priors). Figure 3 plots the efficiency of human observers as compared to each of the three ideal observer models. Note that because human observers learned, to some degree, the values of cue validity, they have a relative advantage in comparison to the naive ideal observer, which is not capable of exploiting this statistical structure. However, because the ideal observers with knowledge of cue validity are more capable of utilizing this information, human efficiency is reduced when compared to these models. Thus, human efficiency is greater when compared to an ideal observer with equal priors (naive ideal observer) than when compared to an ideal observer fully informed of the values of cue validity. Because these represent the extreme bounds of efficiency for naive or fully informed observers, ideal observers using learning algorithms to acquire values of cue validity will necessarily fall between these two measures. As Figure 3 shows, the ideal observer learning with supervised feedback quickly asymptotes towards the informed ideal observer performance. This shows that the ideal observer learns the values of cue validity very rapidly. By the second block of trials, the performance of the ideal observer learning cue validity is nearly indistinguishable from that of the ideal observer with correctly informed values of cue validity. Because the ideal observer learns cue validity so rapidly, more quickly than human observers, efficiency drops early in the experiment (third trial block), but increases later in the experiment, revealing that humans learn cue validity more slowly. 
Figure 3
 
Measure of human efficiency as compared to the ideal observer using three different techniques of acquiring values of cue validity. Efficiency measures for the ideal observer when learning, or when informed of, cue validity are very similar, revealing the rapid and accurate estimate of cue validity for the ideal observer learning algorithm with supervised feedback.
The overall increase in percent correct (Figure 2) and efficiency (Figure 3) suggests that the human observers are improving at the task. To further examine the source of performance improvement, we examined the frequency of each trial outcome as well as choice behavior for each cue type. If performance improvement is entirely due to sources other than learning the statistical structure of the validity of the cues, then the distribution of behavioral choices across cues in target-absent trials should remain constant throughout the duration of the experiment. 
Response categories
For both saccadic and perceptual decisions, each trial was categorized as having one of five possible trial outcomes. Trials in which the observer correctly localized the position of a target on a target-present trial were categorized as hits. Trials in which the observer reported “No Target” when a target had in fact been present were misses. A target-absent trial could be categorized as either a false alarm or a correct rejection, depending on whether the observer incorrectly selected a cue, or had correctly reported “No Target”, respectively. The fifth category was mislocalizations, trials in which the target was present in a cue, but the observer localized the target elsewhere. Figure 4 plots these measures of performance for saccadic ( Figure 4A) and perceptual ( Figure 4B) decisions in each of the three feedback conditions. 
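The five outcome categories described above can be sketched as a small classification function. The function name and the encoding (a cue index for a location, `None` for target absence or a “No Target” report) are assumptions made for illustration, not the authors' analysis code.

```python
def categorize_trial(target_cue, response):
    """Classify one trial into one of the five outcome categories.

    target_cue: index of the cue containing the target, or None if absent.
    response:   index of the cue the observer selected, or None for "No Target".
    """
    if target_cue is None:
        # Target-absent trial: any cue selection is a false alarm.
        return "correct rejection" if response is None else "false alarm"
    if response is None:
        # Target was present but the observer reported "No Target".
        return "miss"
    # Target present and a cue was selected: right cue or wrong cue.
    return "hit" if response == target_cue else "mislocalization"
```

The same function applies to saccadic decisions once each first fixation has been mapped to a cue index or to a “No Target” response.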
Figure 4
 
General performance measures throughout the experiment. (Note that measures of saccadic performance maintain the arbitrary naming convention as used in the perceptual task of localizing or rejecting the target. However, subjects were not given any instruction on the use of eye movements to complete the perceptual task. See text for details.)
Saccadic decisions
As discussed above, subjects were not made aware that their performance for first fixations would be evaluated in a manner similar to their perceptual decisions at the end of each trial. Thus, there was no a priori way in which subjects could unambiguously report “No Target” through gaze alone. However, as a convention, we considered first fixations that were not directed to any cue, but stayed within the 70 pixel (2.87°) radius of center, as a “No Target” response. Thus, the low rate of correct rejections for saccades (<10%) reflects the infrequent trials in which the first fixation on a target-absent trial remained within this central window. And, because eye movements were nearly always directed towards cues, outside of this central window, the false alarm rate for saccadic decisions is quite high (>90%), and the rate of misses is quite low (<10%), regardless of feedback condition ( Figure 4A). 
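The convention for scoring first fixations can be sketched as follows. The 70-pixel central window comes from the text; assigning any fixation outside that window to the nearest cue is an assumption of this example, as are the function name and the coordinate format.

```python
import math

CENTER_RADIUS_PX = 70.0  # central window radius given in the text (2.87 deg)


def classify_first_fixation(fixation, center, cue_positions,
                            radius=CENTER_RADIUS_PX):
    """Map a first fixation to a cue index, or None for a "No Target" response.

    fixation, center: (x, y) positions in pixels.
    cue_positions:    list of (x, y) cue centers in pixels.
    """
    if math.hypot(fixation[0] - center[0], fixation[1] - center[1]) <= radius:
        # Gaze stayed inside the central window: scored as "No Target".
        return None
    # Assumption: otherwise the fixation is attributed to the closest cue.
    distances = [math.hypot(fixation[0] - cx, fixation[1] - cy)
                 for cx, cy in cue_positions]
    return distances.index(min(distances))
```

Because nearly every first fixation leaves the central window, a rule of this kind scores almost all target-absent trials as saccadic false alarms, which is the pattern described above.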
A more informative measure of saccadic accuracy is the hit rate, reflecting correct localization of the target. Saccadic hit rates differed significantly across feedback conditions, with higher rates in conditions with more informative feedback (ANOVA, F(2,62) = 12.65, p = 2.61 × 10⁻⁵; Unsupervised: 19.9%, Reinforcement: 24.8%, Supervised: 33.9%). This performance difference suggests that observers were using information in the informative feedback conditions to guide their decisions in detecting and localizing the target using eye movements. 
Perceptual decisions
Subjects' performance in perceptual decisions is shown in Figure 4B. There are two notable features. First, the rate of hits and false alarms throughout the entire experiment were similar in all three feedback conditions and were not significantly different (Hits: ANOVA, F(2,62) = 0.43, p = 0.65; Unsupervised: 64.4%, Reinforcement: 66.1%, Supervised: 66.9%; False Alarms: ANOVA, F(2,62) = 1.91, p = 0.16; Unsupervised: 42.1%, Reinforcement: 35.1%, Supervised: 33.5%). This suggests that the general performance difference between conditions ( Figure 2) may not have been appreciably influenced by differences in hits and false alarms throughout the entire experiment. 
Another test of performance improvement is to incorporate measures of hits and false alarms into a single metric, captured by each subject's d′.¹ Subjects with more informative feedback showed greater sensitivity to the target in perceptual decisions, and there was a significant difference between feedback conditions (ANOVA, F(2,60) = 4.28, p = 0.02; Unsupervised 0.60, Reinforcement 0.86, Supervised 0.89). Differences in d′ may have reached significance, while hits and false alarms did not, because of the combination of two measures that, by themselves, only approach statistical significance. 
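For readers unfamiliar with the metric, the sketch below computes d′ with the standard yes/no signal detection formula, d′ = z(hit rate) − z(false alarm rate). The exact computation used in this study is described in the article's footnote and may differ; the function name and the clipping of extreme rates are assumptions of this example.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate, floor=0.001):
    """Yes/no sensitivity index: z(hit rate) minus z(false alarm rate).

    Rates are clipped to [floor, 1 - floor] so the inverse normal CDF stays
    finite; this clipping is a common convention assumed for the sketch.
    """
    z = NormalDist().inv_cdf
    h = min(max(hit_rate, floor), 1.0 - floor)
    f = min(max(false_alarm_rate, floor), 1.0 - floor)
    return z(h) - z(f)
```

Applying the function to group-mean rates, for example `d_prime(0.644, 0.421)` for the unsupervised condition, need not reproduce the reported values exactly, because the reported d′ is computed for each subject and then averaged.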
However, despite similar rates of false alarms measured across all 300 trials, there are clear differences between trial blocks in the type of responses during target-absent trials. In the unsupervised and reinforcement conditions, observers had an initially higher rate of false alarms and lower rate of correct rejections than observers receiving supervised feedback. This suggests that observers not receiving supervised feedback had a bias to select cues, and thus report “target present”, on target-absent trials. The “No Target” feedback provided in the supervised condition may therefore allow subjects to assess more accurately the 50% likelihood that a target had been present (or absent) on any given trial. While subjects were told of this likelihood prior to the beginning of the experiment, the additional feedback may have strengthened this estimate, and undermined any intrinsic bias to select a cue rather than to report “no target.” 
The second notable feature regarding subjects' perceptual decisions is that there were significant differences in the rate of mislocalizations ( F(2,62) = 4.61, p = 0.01), with fewest errors occurring when observers were provided with supervised feedback (Unsupervised: 12.1%, Reinforcement: 10.3%, Supervised: 8.0%). This suggests that on error trials, when subjects were informed where the target had appeared during the test stimulus, they were capable of using this information in future trials to improve localization performance. 
The above performance differences, in both saccadic and perceptual decisions, between feedback conditions, suggest that the quality of information provided during feedback influences observers' ability to discriminate and localize the presence of a target. To examine whether these differences in performance accompanied learning of cue validity, we next examined the frequency of hit and false alarm outcomes for each cue type. 
Cue-specific performance
Hit rate by cue
To seek a possible cause underlying the higher hit rate for saccadic decisions and the overall improved performance in perceptual decisions when observers received informative feedback, we further examined the frequency of correctly localizing targets within each cue. Figure 5 plots the hit rate for each cue in each feedback condition throughout the experiment. Performance is averaged across cues with identical validity. Throughout the entire experimental session of 300 trials, observers with reinforcement or supervised feedback had significantly different hit rates for each of the cues in which a target could appear (P = 0.6, 0.2, 0.1, 0.1). This was true for both saccadic and perceptual decisions (Reinforcement: saccadic: F(2,40) = 4.35, p = 0.02, perceptual: F(2,40) = 7.02, p = 0.002; Supervised: saccadic: F(2,40) = 3.88, p = 0.03, perceptual: F(2,40) = 8.88, p = 6.45 × 10⁻⁴). However, observers with no feedback, in the unsupervised condition, showed no difference in the hit rate among the cues for either saccades or perceptual decisions (saccadic: F(2,40) = 1.58, p = 0.23, perceptual: F(2,40) = 2.31, p = 0.11). 
Figure 5
 
Hit rate for saccades (A) and behavioral decisions (B) for each cue. Conditions with more informative feedback had a greater number of hits on probable target cues. Performance is averaged across cues with identical validity.
Similar hit rates between cues in the unsupervised condition suggests that the internal signal generated from response selection at the end of each trial, in addition to the exposure to the statistics of cue validity during the test stimulus, is not sufficient for observers to learn which cues are likely to have the target. However, the higher hit rates when the target appeared at cues with high predictive prior probabilities (i.e., cue effect) in reinforcement and supervised conditions suggest that observers receiving informative feedback are capable of learning which cues are more likely to contain the target, and to adjust their decision strategies accordingly. While the cue effect is strongest in the supervised condition, the ability for subjects to learn cue validity using only reinforcement suggests that completely informative feedback is not critical to learning cue validity in this search task. 
To test whether observers learned cue validity across successive trial blocks, we tested each condition group for an interaction between cue type and trial block. Neither saccadic nor perceptual decisions showed any significant interaction between cue type and trial block for any of the three feedback conditions (2-way ANOVAs, all F(4,8) < 1.20, p > 0.24). Thus, despite the differences in the frequency of hits between each cue in the reinforcement and supervised conditions (see above), these differences appear to have established themselves early in the experiment and were relatively consistent for the remaining trials. 
False alarm by cue
The differences in hit rate across cues might be associated with a biasing or increased weighting towards sensory evidence from the highly predictable cues (Eckstein et al., 2002), but may also be explained by an improvement in the processing of sensory information at such cues, such as a reduction in internal noise (Dosher & Lu, 1998), or both. To disambiguate these two explanations, we next sought to determine if differences in general performance might accompany a difference in the rate of selection across cue types in target-absent trials. If observers are learning the statistics of cue validity on target-present trials, this learning may manifest itself in decisions made during target-absent trials. A notable feature of target-absent trials is that each of the six distractor stimuli, on average, has the same contrast value. Thus, any differences in the frequency of mistakenly selecting among these cues presumably reflect internal differences, such as the observers' learned expectation of where to place more weight for their decision (Eckstein et al., 2002, 2006; Eckstein, Pham, et al., 2004; Kersten, Mamassian, & Yuille, 2004).² 
Potential differences in false alarm rates between the cues may also reflect implicit learning mechanisms. Although the perceptual decision to locate the target is an explicit judgment, the frequency with which the cues are selected may be considered an implicit measure, because a biased choice pattern does not require declarative knowledge of the scene statistics (Bechara, Damasio, Tranel, & Damasio, 1997; Chun & Jiang, 1998; Schacter, 1987). 
Figure 6 plots the false alarm rate for each of the four cue types for both saccadic and perceptual decisions, averaged across cues with identical validity. In the Unsupervised condition, when subjects did not receive feedback, the rate of false alarms for both saccadic and perceptual decisions was similar across each cue type (saccadic: F(3,60) = 0.25, p > 0.87; perceptual: F(3,60) = 0.89, p > 0.45). These unbiased distributions of saccadic and perceptual decisions during target-absent trials suggest that subjects were not learning cue validity, or were not applying this learning to bias their decisions. However, when feedback was provided, either as reinforcement for correct decisions, or with complete supervised feedback, the results were very different. In these informative feedback conditions, the frequency of subjects' saccadic and perceptual false alarms differed across cues, with more frequent selection of probable cues (Reinforcement: saccadic: F(3,60) = 5.45, p = 0.002; perceptual: F(3,60) = 6.91, p = 4.98 × 10⁻⁴; Supervised: saccadic: F(3,60) = 7.57, p = 2.25 × 10⁻⁴; perceptual: F(3,60) = 11.24, p = 5.96 × 10⁻⁶). 
Figure 6
 
False alarm rate for saccades (A) and behavioral decisions (B) for each cue. Note increased cueing effect in conditions with more informative feedback. Performance is averaged across cues with identical validity.
Similar to the above analysis using hit rate, we also assessed the time course of learning of cue validity by testing for an interaction between cue type and trial block. In contrast to the lack of significant interaction with hit rate measures, the false alarm rate did reveal significant interactions between cue type and trial block. Supervised feedback had a significant interaction for both saccadic and perceptual decisions (saccadic: F(4,12) = 2.93, p = 8.03 × 10⁻⁴; perceptual: F(3,60) = 11.19, p = 6.28 × 10⁻⁶), and reinforcement was significant only for perceptual decisions (F(4,12) = 2.96, p = 7.36 × 10⁻⁴). No interactions were found for the unsupervised condition, for either decision measure (F(4,12) < 1, p > 0.66). Whereas the lack of interaction for hit rate suggested that learning is complete relatively early in the experiment, the interaction of false alarms by cue and trial block suggests that learning may continue throughout the entire experiment. 
Start time of first fixation
When the search task includes informative feedback, observers are more likely to direct their first saccades to cues that have a higher likelihood of containing the target ( Figure 6). A possible advantage of this cue effect is the accompanying increase in the hit rate for saccadic and perceptual decisions ( Figure 5). This suggests that observers are using information on cue color to plan the direction of their first saccade, and that this top-down strategy affords subjects with the observed increase in task performance. However, employing this top-down strategy may be at a cost. Because the orientation of the cue array was randomly aligned on each trial, as well as the position of each cue within the array, subjects could have no accurate a priori estimate of where each cue would be located before the onset of the test stimulus. The biased distribution of first fixations could only have occurred had subjects planned their saccades using the color information following test onset. Using cue information to direct first saccades would require top-down computations, including localizing the spatial position of a circle whose color corresponded to the cue in which the observer had the greatest likely estimate of target presence. Thus, observers may have been making a trade-off between the possible performance advantages afforded by directing first saccades to high validity cues, against a temporal delay in making this decision. 
Measures of temporal processes involved in saccadic planning often include the reaction time required to initiate a saccade. While our data collection set did not include the start time for saccades, it did include the start time for first fixations. Because the flight time can reasonably be assumed to be similar across cues and conditions, a comparison of first fixation start times is similar to a comparison to the start times of first saccades. Thus, to investigate any temporal cost in planning eye movements using top-down color information, we calculated the average start time for first fixations to cues ( Figure 7). 
Figure 7
 
Average start time for first fixation to a cue. Longer reaction time for the first saccade in reinforcement and supervised learning conditions may reflect planning saccadic decisions reflecting cue validity. The increased delay for saccadic decisions in supervised and reinforcement learning conditions parallels the cue effect found in those conditions.
Two features of the data are clear. First, the average start time for first fixations is similar for each condition in the first block of sixty trials. Second, by the final block of sixty trials, fixations in the reinforcement and supervised conditions start significantly later than fixations in the unsupervised condition (t-test, Supervised vs. Unsupervised: F(1,40) = 2.34, p < 0.05; Reinforcement vs. Unsupervised: F(1,40) = 2.27, p < 0.05). This suggests that the cue effect for saccadic decisions observed in each of these conditions may entail a temporal cost, used to localize a cue in which the observer has a prior expectation of target presence. The earlier start time of the first fixation in the unsupervised condition is what would be expected if subjects were not using cue color to direct their first saccade, as suggested by the unbiased distribution in saccadic decisions, shown in Figure 6A. Thus, by the final block of 60 trials, the cue effect for saccadic decisions in conditions with informative feedback may require calculations taking approximately 65 ms to compute, echoing similar estimates for target discrimination in frontal eye fields (Thompson, Hanes, Bichot, & Schall, 1996). 
Gaze direction throughout entire trial
As mentioned above, because subjects were not instructed to use shifts in gaze to indicate their response, and multiple fixations could occur within the 2000 ms test stimulus, subjects' first fixations may only begin to indicate how observers applied knowledge of cue validity to direct gaze. Thus, we also analyzed subjects' distribution of gaze during the entire 2000 ms of the test stimulus ( Figure 8). Similar to the analysis for first fixations, we performed this analysis only on target-absent trials, with the presumption that any differences in gaze behavior among these cues would reflect internal estimates of the most likely target location. 
Figure 8
 
Average number of fixations (A) and average total fixation duration (B) during the entire 2000 ms test stimulus.
Gaze distribution throughout the test stimulus was similar, though not identical, to the distribution of first saccades (Figure 6). As with first saccades, there was a significant effect of cue on both the number of fixations and fixation duration in conditions with informative feedback (ANOVA, Reinforcement: num. fixs F(3,60) = 8.97, p = 5.38 × 10⁻⁵, fix. duration F(3,60) = 9.41, p = 3.44 × 10⁻⁵; Supervised: num. fixs F(3,60) = 13.72, p = 6.39 × 10⁻⁷, fix. duration F(3,60) = 13.67, p = 6.70 × 10⁻⁷). However, unlike first saccades, there was also a cue effect in the unsupervised condition that either approached, or just barely reached, levels of statistical significance (ANOVA, num. fixs F(3,60) = 2.05, p = 0.12; fix. duration F(3,60) = 3.16, p = 0.03). The emergence of a cue effect when considering the entire test duration suggests that first saccades in the unsupervised condition are not guided by cue validity, whereas saccades later in the trial are directed more strategically, incorporating top-down knowledge. This interpretation is consistent with differences in the start time of first fixations, addressed above. 
It is also interesting to consider that, in the unsupervised condition, even if saccades are eventually directed to probable cues later during the test stimulus, this knowledge does not appear to translate into a similar distribution of observers' perceptual decision on cue selection during target-absent trials ( Figure 6B). 
Explicit measure of learning
Thus far, the pattern of results suggests that reinforcement and supervised feedback resulted in much greater sensitivity to statistics of cue validity than did unsupervised feedback. This was evidenced in subjects' distribution of saccadic and perceptual decisions, average start time for first fixations, and the distribution of gaze throughout the entire duration of the test stimulus. It is possible that these effects are derived from implicit knowledge of cue validity, without having explicit awareness of having learned this information (Chun & Jiang, 1998). As mentioned above, the distribution with which observers selected each cue does not require explicit declarative knowledge of cue validity, and thus, the analysis of gaze and choice behavior thus far can all be considered implicit measures of learning (Bechara et al., 1997; Chun & Jiang, 1998; Schacter, 1987). 
To assess explicit learning of cue validity at the end of each experiment, we asked subjects to estimate the probability with which each cue would be expected to contain the target on target-present trials, with the constraint that these probabilities should add up to one (see Methods). Figure 9 plots these explicit estimates of cue validity, averaged across subjects. 
Figure 9
 
Average explicit estimates of cue validity. (Actual cue colors were counterbalanced across subjects.) Despite the fact that subjects' saccadic and perceptual decisions did not reflect differences in cue validity in the unsupervised condition, subjects nevertheless were able to report reasonably accurate estimates of cue validity and were similar to those in conditions with supervised or reinforcement feedback.
First, note that while subjects' estimated values were statistically different from the true values of cue validity in each feedback condition (Hotelling 1-sample Chi squared test, p < 0.01 for each condition (Mardia, Kent, & Bibby, 1979)), the subjective ranking of each cue was roughly similar to the actual values used in the experiment. The underestimation of probable cues and overestimation of improbable cues is consistent with previous assessments of learning and frequency estimates (Tversky & Kahneman, 1992). However, the overall shape of this distribution is strongly suggestive that observers acquired explicit knowledge of cue validity. 
Second, and most striking, is that subjects had similar estimates for cue validity, regardless of feedback condition (Hotelling multivariate 2-sample T-squared test, Unsupervised vs. Reinforcement: p = 0.59; Unsupervised vs. Supervised: p = 0.11; Reinforcement vs. Supervised: p = 0.49; Mardia et al., 1979).³ Similar explicit estimates for cue validity are surprising, considering the large differences in implicit behaviors during each feedback condition. In the Discussion, we address the possible relationship between how subjects acquired and used implicit and explicit knowledge of cue validity during this task. 
Discussion
Performance in the present visual search task reveals that observers are capable of learning the statistic of cue validity within a single experimental session. This learning was evidenced by the distribution of saccadic and perceptual decisions, increasingly directed toward valid cues, as well as observers' final explicit estimates of cue validity. Learning cue validity is facilitated with increasing levels of feedback information, suggesting that mechanisms monitoring scene statistics are sensitive to internal measures of task performance. Results from this experiment reveal insights into mechanisms employed in visual search, learning of scene statistics, the relationship between saccadic and perceptual decisions, the influence of task feedback, and the association between implicit and explicit learning. 
Mechanisms mediating performance improvement during visual search
The improvement in general performance throughout the experimental session (Figures 2 and 3) may have been aided by one of at least three candidate mechanisms. The first possibility is that improvement was due to a location-independent reduction in additive internal noise, or increase in pre-internal noise sensory gain,⁴ irrespective of where the stimuli are positioned within the scene (Dosher & Lu, 2000a, 2000b; Lu & Dosher, 1998, 1999). However, while this reduction in additive internal noise might be able to account for improved performance in general measures of percent correct, it is unable to account for differential performance at different locations (Figures 5 and 6), or for increased performance when targets appear in familiar contexts. 
A second possible mechanism to account for improved performance is that observers distribute a finite supply of attention among regions of the scene as if it were a limited resource. Visual processing could thus be capacity-limited, where the role of visual attention is to allocate these limited perceptual resources to the attended location, and this would result in an enhancement in the quality of sensory processing (Luck, Hillyard, Mouloua, & Hawkins, 1996), akin to a “mental spotlight” (Cave & Bichot, 1999). However, human performance is often in contrast to these model predictions. For example, strict interpretation of this model would suggest that as observers learn which cue is most valid, their limited perceptual resources are deployed to this location, allowing for more accurate discrimination, including improved ability to reject the presence of a target at the attended location. This would still result in improved performance for both saccadic and perceptual decisions on target-present trials when the cue was valid, due to the enhanced discriminability and detection of the target. However, the increased perceptual sensitivity afforded by the capacity-limited model would also predict a decrease in saccadic and perceptual decisions for highly valid cues on target-absent trials (Eckstein, Abbey, et al., 2004; Eckstein et al., 2002). This is because the increased sensitivity to the target at the cued location would necessarily result in a decrease in the false alarms for localizing the target at that position. This predicted effect is contrary to the biases in selections observed in the present study ( Figure 6). 
A third possible mechanism to explain the observed cueing effect is differential-weighting, where the allocation of visual attention can be described as the distribution of weights assigned to the sensory evidence for a state of the world (Dayan & Zemel, 1999; Eckstein, Abbey, et al., 2004; Eckstein, Pham, et al., 2004; Eckstein et al., 2002; Kinchla, Chen, & Evert, 1995; Shaw, 1982; Shimozaki, Eckstein, & Abbey, 2003). In this framework, visual information is encoded with equal quality, saliency, or signal-to-noise ratio, across the possible target locations, but information arising from more probable locations is given more weight towards the decision of detecting or rejecting the target. Thus, this weighting model incorporates both the raw sensory information in the image (e.g. contrast value), as well as internal estimates of the likelihood of this sensory information occurring, given an observer's prior knowledge of the probability that the state of the world (e.g. target presence) would give rise to this stimulus. A differential-weighting model captures psychophysical demonstrations of cueing (Eckstein, Pham, et al., 2004; Eckstein et al., 2002), context effect of saccadic deployment in natural scenes (Eckstein et al., 2006; Torralba et al., 2006) as well as neurophysiology data (Eckstein, Liston, & Krauzlis, 2007), and quantitative models of eye movements (Carpenter & Williams, 1995). Differential weighting is also compatible with theories of selective attention that consider the computational challenge of monitoring uncertainty in dynamic environments and the role of prediction in guiding perceptual decisions (Dayan, Kakade, & Montague, 2000). Observers' distribution of saccadic and perceptual decisions in the present task is consistent with this model: cues more likely to contain the target were increasingly likely to be selected ( Figure 6). 
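The differential-weighting account can be made concrete with a small simulation. The sketch below is a generic Bayesian observer, not the model fitted in the cited studies: each location yields a unit-variance Gaussian response of equal quality, each response is converted to a likelihood ratio, and that ratio is weighted by the cue's prior before a decision is made. The validity vector, the detectability value `D_PRIME`, and all function names are assumptions made for illustration.

```python
import math
import random

PRIORS = [0.6, 0.2, 0.1, 0.1, 0.0, 0.0]  # assumed cue validities
P_PRESENT = 0.5                           # target present on half of the trials
D_PRIME = 1.0                             # hypothetical target detectability


def likelihood_ratio(x, d=D_PRIME):
    """Target-present vs. noise-only likelihood ratio for a Gaussian response."""
    return math.exp(d * x - d * d / 2.0)


def weighted_decision(responses, priors=PRIORS, p_present=P_PRESENT):
    """Return the selected cue index, or None for a "No Target" response."""
    # Equal-quality evidence at every location, weighted by each cue's prior.
    weighted = [p * likelihood_ratio(x) for p, x in zip(priors, responses)]
    posterior_odds = (p_present / (1.0 - p_present)) * sum(weighted)
    if posterior_odds <= 1.0:
        return None
    return max(range(len(weighted)), key=weighted.__getitem__)


def false_alarm_counts(n_trials, seed=0):
    """Count which cue is selected on simulated target-absent trials."""
    rng = random.Random(seed)
    counts = [0] * len(PRIORS)
    for _ in range(n_trials):
        # Target absent: every location has the same response distribution.
        responses = [rng.gauss(0.0, 1.0) for _ in PRIORS]
        choice = weighted_decision(responses)
        if choice is not None:
            counts[choice] += 1
    return counts
```

Although the sensory evidence in `false_alarm_counts` is statistically identical at every location, the weighting concentrates false alarms on the cues with the highest priors. This is the qualitative pattern in Figure 6, and the opposite of what a capacity-limited account would predict.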
Thus, this cue effect is likely a reflection of a prior expectation of target presence, influencing the value of an internal decision variable used for saccades and perceptual decisions. We next explore how these expectations may be learned. 
Exploiting scene statistics to improve search performance
Several experiments have already demonstrated that the visual information detected in a scene, or where subjects choose to attend, depends on their estimates of where targets or objects are likely to be located. In experiments that employ artificial stimuli, subjects' knowledge of scene statistics is often derived from experimenters' explicit instructions, and the benefits of using this information in improving task performance is implied (Eckstein, Pham, et al., 2004; Eckstein et al., 2002; Palmer et al., 1993; Posner, 1980; Shaw & Shaw, 1977). The utility of exploiting scene structure has also been demonstrated using more ecologically relevant stimuli, such as images of natural scenes (Biederman, Mezzanotte, & Rabinowitz, 1982; Biederman, Teitelbaum, & Mezzanotte, 1983; Chen & Zelinsky, 2006; Eckstein et al., 2006; Hidalgo-Sotelo, Oliva, & Torralba, 2005; Hollingworth & Henderson, 1998; Torralba et al., 2006). However, because observers have had a lifetime of exposure to similar scenes or objects, the time course required for learning statistical structure, or the mechanisms involved, are not clear. 
Learning effects are more directly observable under conditions using novel stimuli with which the observer has no previous exposure and are not informed of the underlying statistical structure. Performance improvements throughout the course of a single experimental session have been shown when this structure includes the probable spatial position of a target (Geng & Behrmann, 2005), or repeated spatial configurations (Chun & Jiang, 1998), and these performance improvements may be facilitated by biases in the distribution of gaze (Peterson & Kramer, 2001; Walthew & Gilchrist, 2006). Observers are also capable of learning the probability that certain objects may undergo a change, and distribute their saccadic and perceptual decisions towards objects likely to be the target for a correct response (Droll et al., 2007). 
In the Droll et al. (2007) study, a bias in observers' behavior emerged even though participants were not alerted to the presence of an underlying statistical structure. In the present study, subjects were informed that some cues were more likely to contain the target than others, but were not informed about the possible values or distribution of cue validity and thus had to learn this structure on their own. Because the cue effect continued to develop throughout all 300 trials, and was not observed in all feedback conditions, informing observers of the existence of this structure is unlikely to have significantly influenced performance. It is possible that target-cue co-occurrence is a statistic that observers are intrinsically disposed to monitor and exploit (Michel & Jacobs, 2007), regardless of verbal instruction. 
However, measures of efficiency (Figure 3) suggest that the ability of human observers to learn and exploit cue validity is suboptimal. The ideal observer using supervised feedback had almost completely learned the values of cue validity by the second block of trials, and the steady, or decreased, measure of efficiency in the first half of the experiment is a consequence of relatively impaired, or slowed, learning in the human observers (Figure 3) (Eckstein, Abbey, et al., 2004). Yet as the experiment progressed, subjects continued learning cue validity, as evidenced by an increase in efficiency in the latter half of the experiment, as well as the increase in false alarm trials towards probable cues (Figure 6). 
The use of scene statistics in visual search is not without cost. Subjects who received informative feedback in the reinforcement and supervised feedback conditions had increasingly longer start times, which accompanied their saccadic cue effect (Figure 7). This temporal cost is likely to reflect processing stages of cue recognition and guidance of attention, reflected in both psychophysical (Peterson & Kramer, 2001) and physiological (Thompson, Bichot, & Sato, 2005; Thompson et al., 1996) measures. 
Internal information used in saccadic and perceptual decisions
In conditions with informative feedback, a cue effect was observed in both saccadic and perceptual decisions (Figure 6). This is consistent with neurophysiological and psychophysical experiments suggesting that similar internal representations contribute to saccadic and perceptual decisions. Cortical networks serving eye movements and visual attention overlap substantially (Corbetta, 1998; Corbetta et al., 1998). The direction of covert visual spatial attention is tightly yoked to the location the observer is next planning to fixate (Kowler, Anderson, Dosher, & Blaser, 1995) and search saccades are driven by neural mechanisms that represent properties relevant to the perceptual task (Beutter, Eckstein, & Stone, 2003; Eckstein, Beutter, Pham, Shimozaki, & Stone, 2007; Mazer & Gallant, 2003; Rajashekar et al., 2006). 
If shared internal representations for saccadic and perceptual decisions were responsible for the presently observed bias in gaze, this would be particularly interesting because the feedback at the end of each trial pertained only to observers' perceptual decision. In principle, it would have been possible for observers to employ a variety of idiosyncratic alternative gaze strategies during the long 2000 ms test stimulus that would still have allowed significant sampling, yet would not have mirrored the distribution of perceptual decisions (e.g. always scanning clockwise). The present results suggest that cortical areas controlling gaze behavior may be sensitive to feedback that relates not to the saccadic decision itself, but to the relevant perceptual decision. 
This is an important observation for the following reason. As mentioned in the introduction, several cortical areas controlling eye movements have been shown to exhibit sensitivity to reward related variables and task structure, guided by performance feedback (Ding & Hikosaka, 2006; Glimcher & Rustichini, 2004; Hikosaka et al., 2000; Ikeda & Hikosaka, 2003; McCoy & Platt, 2005; Platt, 2002; Platt & Glimcher, 1999). It has been suggested that this sensitivity is capable of guiding learning during extended tasks typical of ordinary human behavior (Hayhoe & Ballard, 2005). An unresolved issue in this field is that this cortical sensitivity is demonstrated in conditions where feedback directly relates to saccadic decisions, yet eye movements themselves rarely result in a direct reward during ordinary behavior (Ipata, Gee, Goldberg, & Bisley, 2006). The present results demonstrate that feedback for perceptual decisions may sufficiently serve as a learning signal for saccades, thereby obviating the need for observers to receive feedback or reward related to each eye movement that facilitates final perceptual decisions. 
The role of feedback
Our main finding is that observers exhibit a greater cue effect in both saccadic and perceptual decisions when informative feedback is provided. In the unsupervised condition when no feedback information was provided, little or no cue effect was observed. The absence of a significant cue effect without feedback suggests that the internal signals generated from response selection alone, or the mere exposure to visual scenes, may be insufficient for learning cue validity. When reinforcement or supervised feedback was provided, observers increasingly biased their decisions towards high validity cues, without having been initially instructed on the values or distribution of cue validity. 
Previous visual tasks have demonstrated learning using supervised feedback (Droll, Gigone, & Hayhoe, 2007), reinforcement (Seitz, Nanez, Holloway, Tsushima, & Watanabe, 2006), or only passive exposure (Fiser & Aslin, 2001). The learning of scene statistics may also be facilitated by visual attention, suggesting that learning scene structure is sensitive to top-down signals and attentional control (Jiang & Leung, 2005). Because of the different stimuli and task paradigms in each of these experiments, generalizing how feedback, or other task elements, contributes to performance improvement is not always clear (Fine & Jacobs, 2002). In this experiment, the stimuli and task were constant across all conditions, and only differed in the feedback. How might differences in feedback account for the observed differences in performance? 
The cue effect was strongest for supervised feedback, and slightly subdued using only reinforcement. There are at least two possibilities to account for these differences. One possibility is that the smaller cue effect using reinforcement reflects observers' selection strategies that allow for exploration. Exploring cues with uncertain validity instead of exploiting learned information is a central feature of many reinforcement learning algorithms. Such exploration may allow observers to be sensitive to shifts in task structure or scene statistics (Daw et al., 2006; Yu & Dayan, 2005). However, it is not clear how applying an exploratory strategy would be useful in the unsupervised condition, where investigative choice behavior would provide no opportunity to reveal new information. Thus, while exploratory decision strategies may be helpful in real-world situations, exploration cannot solely account for the observed differences in saccades or perceptual choices between feedback conditions in this particular task. 
The second possibility to account for differences in cue effect across feedback conditions is that observers are simply uncertain what cue information to update, and thus do not improve their estimates of cue validity. This is in contrast to the supervised condition, where incorrect responses are followed by instructions on what the correct answer would have been for that trial. Subjects may be capable of using this supplementary information to update their priors, and thus more quickly learn and distribute their gaze and perceptual decisions in a manner in accordance with the values of cue validity. This interpretation is consistent with human performance in a rapid perceptual learning paradigm, where observers update their priors following a correct response, but show no performance benefit following incorrect feedback (Abbey, Pham, Shimozaki, & Eckstein, 2008; Eckstein, Abbey, et al., 2004). 
While the cue effect was consistently strongest following supervised feedback, there was also significant performance improvement when subjects were only given reinforcement. In the final half of the experiment, overall percent correct for perceptual decisions was nearly identical with supervised or reinforcement feedback, and both were higher than unsupervised feedback (Figure 2B). This is important because it shows that for a visual search task, reinforcement may be sufficient for learning to transpire, and supervised learning is not always necessary (Droll et al., 2007). As exploratory actions in a new task or environment are usually rewarded only following success, and supervised learning may not be available, it is often critical that reinforcement be sufficient for learning to occur (Sprague et al., in press). 
The central aim behind the design of the present study was to identify sources of information contributing to learning mechanisms during visual search. As mentioned in the introduction, it is not always clear how feedback information drives learning across different search tasks. When stimuli have a high signal-to-noise ratio and error rates are low, observers are likely aware when their saccadic or perceptual decision has correctly localized the target of interest, and the delivery of overt feedback is redundant (e.g. searching for a “T” among “Ls”). Under those circumstances, it is uncertain whether the learning of scene statistics is derived from passive exposure to scene information, or from the interactive reinforcement and error signals that arise between observers' actions and task-relevant visual information. In the present experiment, the same set of stimuli was used for each feedback condition, and exposure to this same set of stimuli resulted in different patterns of learning, depending on feedback. Due to contrast noise, performance included frequent errors. This allowed feedback to serve as a non-redundant and informative learning signal, not present in the visual scenes alone. During tasks in which feedback does not provide this additional source of information, differences in learning may not be apparent. 
Implicit versus explicit knowledge
When feedback was not provided in the unsupervised condition, subjects showed no learning of cue validity in their distribution of first saccades (Figure 6A), and very little learning in their distribution of gaze throughout the entire trial (Figure 8) or perceptual decisions (Figure 6B). Yet, from these same subjects, explicit estimates of cue validity were statistically indistinguishable from conditions in which more informative feedback was provided and implicit learning was clearly demonstrated (Figure 9). 
Psychology literature is rife with attempts to distinguish between implicit and explicit learning or memory. Schacter (1987) suggested: “Implicit memory is revealed when previous experiences facilitate performance on a task that does not require conscious or intentional recollection of those experiences; explicit memory is revealed when performance on a task requires conscious recollection of previous experiences.” In accordance, measures of implicit knowledge have included fixation duration or saccadic targeting (Hollingworth & Henderson, 2002), and measures of explicit knowledge include declarative report. It is not clear how these separate knowledge structures may influence each other; whether implicit knowledge accumulates through repeated exposure until it reaches a threshold of explicit awareness (Cleeremans & Jimenez, 2002), or how explicit knowledge may influence implicit measures (Jiménez, Vaquero, & Lupiáñez, 2006). Guiding visual decisions through implicit, but not explicit, knowledge of scene statistics is considered advantageous because implicit representations have been characterized as having a larger capacity, and more robust to decay or interference (Lewicki, Hill, & Czyzewska, 1992). Relying on these implicit representations would also presumably allow more processing resources to be devoted to knowledge structures accessible to explicit knowledge (Chun & Nakayama, 2000). 
The pattern of results in the present experiment is suggestive of an alternative interpretation. It is possible that as the experiment progressed, subjects in each of the three feedback conditions acquired explicit knowledge of cue validity, while only those subjects receiving informative feedback acquired implicit knowledge. This interpretation would be in contrast to other suggestions that an accumulation of implicit knowledge is required before forming into explicit representations (Chun & Jiang, 1998). Conversely, the formation of explicit knowledge in the unsupervised condition may not necessarily have access to the control of implicit decisions, or may result in a wide range of behavior, thus not always translating into advantageous decisions (Bechara et al., 1997; Jimenez et al., 2006). 
However, it is also important to note that the dissociation between implicit and explicit behavior was not complete; gaze throughout the entire trial and final perceptual decisions had a modest bias towards valid cues in the later half of the experiment. While this modest bias was less dramatic than the learning suggested in explicit estimates, it is not clear how differences in implicit behaviors translate into differences in explicit knowledge. Thus, while the different learning patterns between implicit behavior and explicit estimates are suggestive of separate learning mechanisms, this dissociation was not complete. 
Conclusions
Our results demonstrate that human observers can quickly learn statistics of cue validity, and that this learning guides saccadic and perceptual decisions performed in environments with uncertain scene structure. Mere exposure to statistical information, or internal signals generated from response selection, is not sufficient for learning to occur; reinforcement or supervision is critical. However, reinforcement signals alone may be sufficient to drive learning, without the need for additional supervised feedback. Reinforcement signals obtained through perceptual decisions are also capable of influencing saccadic decisions, suggesting a common learning mechanism. Finally, while task feedback influences implicit decisions, explicit knowledge of scene statistics may develop separately, suggesting different learning mechanisms for these knowledge structures. 
Appendix A
An ideal observer for forced-choice identification with unknown cue validity
In this appendix, we derive the Bayesian ideal observer for the cue validity task with feedback. This observer optimizes task performance under the assumption that all errors are equally bad. As we shall see below, the observer operates by weighting likelihoods derived from the stimulus by the validity of the associated location cue, and then updates the estimate of cue validity based on the feedback from each trial. 
The ideal observer response with cue validity
Let $c = 1, \ldots, C$ identify the different cues in the task. We identify the observer response by $\hat{c}$, the cue selected. In this task, there is also the possibility of no target being present in the stimulus, with a known prior probability of 50%. To include this possibility for the stimuli (and responses), we add the response $\hat{c} = 0$. Thus there are a total of $C + 1$ possible alternatives, and therefore $C + 1$ possible responses in any given trial. 
For a static display where the possible targets are a known profile modulated by a random contrast, it is sufficient to represent the stimulus by the contrast value at each cue. We define a $C$-dimensional vector of contrasts, $\mathbf{g}$, where each element, $g_c$, is the target contrast at cue $c$. The ideal observer is able to perfectly represent contrast, and hence we presume it can access this vector. The task is to localize or reject a contrast increment, and hence the contrast at any given cue consists of a pedestal, $p$, noise, $n_c$, and possibly a target increment, $s$. Thus if cue $c$ does not contain the target, its contrast has a mean of $p$ and a standard deviation of $\sigma_n$. If cue $c$ does contain the target, the mean is increased by $s$. The noise is independent across cues and Gaussian distributed, which gives a likelihood function for no signal present (i.e. $c = 0$) of  
$$p(\mathbf{g} \mid 0) = \prod_{c=1}^{C} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{1}{2}\left(\frac{g_c - p}{\sigma_n}\right)^2\right).$$
(A1)
If a signal is present at cue c, the likelihood function is  
$$p(\mathbf{g} \mid c) = p(\mathbf{g} \mid 0) \exp\left(\frac{1}{\sigma_n^2}\left((g_c - p)\,s - \frac{1}{2}s^2\right)\right).$$
(A2)
Taken together, Equations A1 and A2 specify the likelihood of each stimulus for all C + 1 possible cueing conditions. 
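As a check on the step from Equation A1 to Equation A2, the likelihood ratio can be written out directly. Only the Gaussian factor for cue $c$ differs between the two hypotheses (mean $p + s$ rather than $p$), so all other factors cancel. The following is a reader's reconstruction from the definitions above, not part of the original derivation:

$$\frac{p(\mathbf{g} \mid c)}{p(\mathbf{g} \mid 0)} = \exp\left(-\frac{(g_c - p - s)^2 - (g_c - p)^2}{2\sigma_n^2}\right) = \exp\left(\frac{1}{\sigma_n^2}\left((g_c - p)\,s - \frac{1}{2}s^2\right)\right).$$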
The ideal observer will choose the cued location that maximizes the posterior probability, $p(c \mid \mathbf{g})$. Under Bayes rule, this posterior distribution can be written in terms of the likelihood function in Equations A1 and A2, and a prior on each of the possible responses, $\pi_c$, as  
$$p(c \mid \mathbf{g}) = \frac{p(\mathbf{g} \mid c)\,\pi_c}{\sum_{c'=0}^{C} p(\mathbf{g} \mid c')\,\pi_{c'}}.$$
(A3)
When the cue validities are known, the prior probabilities are directly related to them, as we shall see below. When the validity is only known stochastically from prior observations, priors are estimated using a Bayesian framework. 
Since the denominator of Equation A3 is independent of c, the ideal observer can be formulated by choosing the cue index that maximizes the numerator according to  
$$\hat{c}_{IO} = \underset{c}{\operatorname{argmax}} \left( p(\mathbf{g} \mid c)\,\pi_c \right).$$
(A4)
This is equivalent to choosing the cued location that maximizes the posterior probability of signal presence. 
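The decision rule in Equation A4 can be illustrated with a short numerical sketch. This is not the authors' implementation; the function names and the parameter values in the usage note are assumptions made for illustration. The code works with log-likelihoods (Equations A1 and A2) to avoid numerical underflow, which leaves the argmax unchanged.

```python
import numpy as np

def log_likelihoods(g, p, s, sigma_n):
    """Log-likelihoods for hypotheses c = 0..C given contrasts g (Eqs. A1, A2)."""
    g = np.asarray(g, dtype=float)
    # Eq. A1: independent Gaussians with mean p at every cue (target absent).
    log_l0 = np.sum(-0.5 * np.log(2 * np.pi * sigma_n**2)
                    - 0.5 * ((g - p) / sigma_n) ** 2)
    # Eq. A2: a target at cue c multiplies p(g|0) by a likelihood ratio.
    log_ratio = ((g - p) * s - 0.5 * s**2) / sigma_n**2
    return np.concatenate(([log_l0], log_l0 + log_ratio))

def ideal_response(g, priors, p, s, sigma_n):
    """Eq. A4: index (0 = target absent) maximizing likelihood times prior."""
    log_post = log_likelihoods(g, p, s, sigma_n) + np.log(priors)
    return int(np.argmax(log_post))
```

For example, with four cues, a hypothetical pedestal of 0.2, increment of 0.1, noise of 0.05, and priors `[0.5, 0.125, 0.125, 0.125, 0.125]`, a display with contrasts `[0.2, 0.2, 0.5, 0.2]` yields response 3, while a display with every cue at the pedestal yields response 0 (target absent).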
Ideal observer with stochastically known cue validity
Equation A4 specifies the ideal observer response in terms of the prior probability on each $c$. This is not quite the same thing as cue validity in the task since there is a known 50% chance that no target is present, and therefore $\pi_0 = 0.5$. At the cued locations, the prior is related to the cue validity, $v_c$, by $\pi_c = 0.5\,v_c$. Note that the cue validities are themselves a probability distribution, and sum to one for $c = 1, \ldots, C$. 
In order to deal with the problem of unknown cue validity, we introduce a prior distribution on $\mathbf{u}$, the vector of all cue validities. When the prior is known, $v_c = u_c$. However, when there is uncertainty about cue probability, the resulting effective cue validity for location $c$ is given by integrating over this prior to get  
$$v_c = \int d\mathbf{u}\; p(c \mid \mathbf{u})\, p(\mathbf{u}) = \int d\mathbf{u}\; u_c\, p(\mathbf{u}).$$
(A5)
Thus cue validity in the presence of uncertainty is simply the expected cue probability, averaged over the prior distribution on $\mathbf{u}$. 
The Dirichlet prior and multinomial observations of cue collocations
Here we treat the case where knowledge of cue validity comes from previous trials in the supervised learning condition. Let $q_c$ be the number of times that each cue, $c = 1, \ldots, C$, has been identified as the target location in previous trials. The sum of all the cue frequencies, $Q = \sum_c q_c$, should be roughly half the total number of previous trials since there is a 50% probability of no target being present in any given trial. Given that there is some underlying cue validity, $\mathbf{u}$, generating the various target locations, the conditional probability of cue-target collocation is multinomial, with a probability distribution of  
$$p(\mathbf{q} \mid \mathbf{u}) = \Gamma(Q + 1) \prod_{c=1}^{C} \frac{u_c^{q_c}}{\Gamma(q_c + 1)},$$
(A6)
where $\Gamma$ is the Gamma function. However, the real problem is to know the distribution of the unknown cue validities given some observed cue collocation data. In this case the distribution of interest is the probability of $\mathbf{u}$ given the collocation data in $\mathbf{q}$. Under the assumption that absent any collocation data all cue validities are equally probable, Bayes theorem specifies that the posterior distribution, $p(\mathbf{u} \mid \mathbf{q})$, has a Dirichlet distribution  
$$p(\mathbf{u} \mid \mathbf{q}) = \Gamma(C + Q) \prod_{c=1}^{C} \frac{u_c^{q_c}}{\Gamma(q_c + 1)}.$$
(A7)
The main difference between Equations A7 and A6 is the leading term, which reflects normalizing over $\mathbf{u}$ instead of $\mathbf{q}$. The Dirichlet distribution also assumes the constraints that $0 \le u_c \le 1$ and $\sum_c u_c = 1$, which are consistent with the role of $\mathbf{u}$ as representing cue validities. The Dirichlet distribution is well defined for parameter values $q_c > -1$. However, for this work, where the parameters represent the number of previous observations for each location cue, the parameters will be constrained to nonnegative whole numbers. 
The effect of learning
Using the Dirichlet Distribution in Equation A7 as the prior on u in Equation A5 results in an effective cue validity of  
$$v_c = \frac{q_c + 1}{C + Q}.$$
(A8)
It is clear that in the absence of any collocation data (i.e. all $q_c = 0$) the effective cue validities are uniform with $v_c = 1/C$. In the limit of a large amount of collocation data (i.e. $q_c \gg 1$ and $Q \gg C$), the effective validity converges to $q_c/Q$, which in turn converges to the underlying validity $u_c$. This demonstrates the basic effect of supervised learning of collocation in this paradigm. As we run through more and more trials, we gain more information about the underlying cue validity through feedback providing collocation data. 
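The update in Equation A8 amounts to keeping a count for each cue and adding one pseudo-observation per cue. A minimal sketch follows; the function names are hypothetical and the code is an illustration of the formula, not the authors' code.

```python
import numpy as np

def effective_validity(q):
    """Eq. A8: v_c = (q_c + 1) / (C + Q) for collocation counts q."""
    q = np.asarray(q, dtype=float)
    return (q + 1.0) / (q.size + q.sum())

def response_priors(q, p_absent=0.5):
    """Priors over responses 0..C: pi_0 = 0.5 and pi_c = 0.5 * v_c."""
    v = effective_validity(q)
    return np.concatenate(([p_absent], (1.0 - p_absent) * v))

def update_counts(q, target_cue):
    """Supervised feedback: target_cue is 1..C, or 0 when no target was shown."""
    q = np.array(q, dtype=int)  # copy so the caller's counts are not mutated
    if target_cue > 0:
        q[target_cue - 1] += 1
    return q
```

With no data, `effective_validity([0, 0, 0, 0])` is uniform at 0.25 per cue. After counts of `[3, 1, 0, 0]` it becomes `[0.5, 0.25, 0.125, 0.125]`, showing how the estimate moves towards the observed frequencies while the pseudo-counts keep unseen cues from dropping to zero.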
Naive and informed observer models
We can consider two modifications of the ideal observer that serve to demonstrate the effect of cue validity on optimal performance. The naive observer uses the stimulus likelihood in Equation A4 to detect and localize the target, but it sets all cue validities to $v_c = 1/C$. This effectively negates the effect of cue probability, leaving the decision variable only dependent on the likelihood. The naive observer is equivalent to the ideal observer only when all cues are known to be equally probable or when no information about cue validity is known and any distribution of validity is equally possible, as in the first trial of our experiments. Since the naive observer cannot update cue validity, its ensemble performance should remain constant over sequential trials. Thus the ideal observer should initially perform like the naive observer and then improve with experience when validity is unequal across cues. 
At the other end of the spectrum, the informed observer also uses Equation A4 to detect and localize the target, but it has perfect knowledge of cue validity. In this case, $v_c$ is matched to the actual validity used to generate the experimental displays. The informed observer is equivalent to the ideal observer when cue validity is known, and thus will outperform the ideal observer derived for stochastically known cue validity. In the long run, the estimated cue validities in Equation A8 will converge to the true cue validity and thus performance of the ideal observer will approach that of the informed observer after many trials. 
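The relationship among the three observers can be explored with a small Monte Carlo sketch. The stimulus parameters (`p`, `s`, `sigma_n`), the validity vector in the usage note, and the function name are assumptions chosen for illustration, not the values used in the experiment. All three observers apply Equation A4 and differ only in the validities they plug in; the common factor $p(\mathbf{g} \mid 0)$ is dropped because it does not affect the argmax.

```python
import numpy as np

def simulate(validity, n_trials=300, n_runs=2000, p=0.2, s=0.1,
             sigma_n=0.05, mode="ideal", seed=0):
    """Proportion correct per trial for the 'naive', 'informed', or 'ideal' observer."""
    rng = np.random.default_rng(seed)
    validity = np.asarray(validity, dtype=float)
    C = validity.size
    rows = np.arange(n_runs)
    q = np.zeros((n_runs, C))                  # collocation counts per run
    pc = np.zeros(n_trials)
    for t in range(n_trials):
        present = rng.random(n_runs) < 0.5     # 50% target-present trials
        loc = rng.choice(C, size=n_runs, p=validity)
        truth = np.where(present, loc + 1, 0)  # 0 codes "no target"
        g = p + sigma_n * rng.standard_normal((n_runs, C))
        g[rows[present], loc[present]] += s
        if mode == "naive":
            v = np.full((n_runs, C), 1.0 / C)
        elif mode == "informed":
            v = np.tile(validity, (n_runs, 1))
        else:                                  # Eq. A8 with current counts
            v = (q + 1.0) / (C + q.sum(axis=1, keepdims=True))
        # Log of likelihood ratio times prior, relative to p(g|0).
        log_target = ((g - p) * s - 0.5 * s**2) / sigma_n**2 + np.log(0.5 * v)
        log_absent = np.full((n_runs, 1), np.log(0.5))
        resp = np.hstack([log_absent, log_target]).argmax(axis=1)
        pc[t] = np.mean(resp == truth)
        q[rows[present], loc[present]] += 1    # supervised feedback
    return pc
```

Because the random stream does not depend on `mode`, calling `simulate` with the same seed presents identical stimuli to each observer, so their performance can be compared trial by trial. On the first trial the learning observer uses $v_c = 1/C$ and therefore responds exactly as the naive observer does.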
Computation of performance efficiency
Performance of the models is estimated by averaging the performance results of five thousand Monte-Carlo runs. This process is repeated for many different contrast levels in order to build a lookup table (LUT) for matching human observer performance over a given range of trials. Linear interpolation is used to obtain target contrasts between LUT performance levels. We note that adjusting contrast in these models is equivalent to adjusting noise magnitude, $\sigma_n$. 
Let $s_{\mathrm{hum}}$ be the increment contrast in the human observer experiments, and let $s_{\mathrm{mod},b}$ be the contrast for the model needed to match human observer performance in trial block $b$, determined from the LUT. The efficiency of human observer performance relative to the model is then given by  
$$\eta_b = \left(\frac{s_{\mathrm{mod},b}}{s_{\mathrm{hum}}}\right)^2.$$
(A9)
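The lookup-and-interpolate step and Equation A9 can be sketched as follows. The function names and the LUT values in the usage note are made up for illustration, and the sketch assumes that model performance increases monotonically with contrast.

```python
import numpy as np

def matching_contrast(lut_contrast, lut_pc, human_pc):
    """Linearly interpolate the model contrast that yields human_pc."""
    lut_contrast = np.asarray(lut_contrast, dtype=float)
    lut_pc = np.asarray(lut_pc, dtype=float)
    order = np.argsort(lut_pc)  # np.interp requires increasing x values
    return float(np.interp(human_pc, lut_pc[order], lut_contrast[order]))

def efficiency(s_model, s_human):
    """Eq. A9: squared ratio of model contrast to human contrast."""
    return (s_model / s_human) ** 2
```

For a hypothetical LUT with contrasts `[0.02, 0.04, 0.06, 0.08]` and proportions correct `[0.55, 0.65, 0.80, 0.90]`, a human proportion correct of 0.725 falls midway between the second and third entries, giving a matching model contrast of about 0.05. If the human increment contrast were 0.1, the efficiency would be about 0.25.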
 
Acknowledgments
This work was supported by National Institutes of Health Grant R01 EY015925 and by IC Postdoctoral Research Fellowship Program Grant 8-444069-23149 (icpostdoc.org). We thank Carter Phelps and Julia Charland for help running subjects, and Jerry Tietz for excellent technical assistance. 
Commercial relationships: none. 
Corresponding author: Jason A. Droll. 
Email: droll@psych.ucsb.edu. 
Address: Department of Psychology, University of California Santa Barbara, Santa Barbara, CA 93106, USA. 
Footnotes
1  While calculating d′ from hits and false alarms in a “localize or reject” task violates the underlying assumptions of a standard detection task without a localization response, and also assumes an equal distribution of priors, this measure allows us to summarize hit rate and false alarm rate in a single measure of performance and may thus reveal performance differences between feedback conditions.
2  In some tasks, such as the localization task used presently, a larger multiplicative weighting is mathematically equivalent to an additive bias or a change in decision criterion. This mathematical equivalence does not hold for a yes/no task with uncertainty about target location.
3  Two of the sixty-three subjects reported values that added up to 0.95 and 1.01. A statistical analysis using normalized values for their cue validity estimates produced virtually identical results.
4  Note that our terminology is different from that used by Dosher and Lu (2000a, 2000b) and Lu and Dosher (1998, 1999). Our use of the term additive internal noise (Burgess, Wagner, & Jennings, 1981; Eckstein, Ahumada, & Watson, 1997) refers to the component that Dosher and Lu (2000a) refer to as stimulus enhancement. Additive internal noise should not be confused with the multiplicative noise in the Dosher and Lu (2000a, 2000b) model. In addition, a decrease in additive internal noise cannot be distinguished from an increase in sensory gain prior to the additive noise, thus we mention both possibilities as explanations for increased performance. Finally, we do not consider calculation or sampling efficiency (Gold, Bennett, & Sekuler, 1999) because our stimulus at each location has a single sample of noise eliminating the possibility of inefficient integration within each of the individual locations.
References
Abbey, C. K. Pham, B. T. Shimozaki, S. S. Eckstein, M. P. (2008). Contrast and stimulus information effects in rapid learning of a visual task. Journal of Vision, 8, (2):8, 1–14, http://journalofvision.org/8/2/8/, doi:10.1167/8.2.8. [PubMed] [Article] [CrossRef] [PubMed]
Barlow, H. (1989). Unsupervised learning. Neural Computation, 1, 295–311. [CrossRef]
Barlow, H. B. (1980). The absolute efficiency of perceptual decisions. Philosophical Transaction of the Royal Society of London B: Biological Sciences, 290, 71–82. [PubMed] [CrossRef]
Bartlett, M. S. Sejnowski, T. J. Bower, J. M. (1996). Unsupervised learning of invariant representations of faces through temporal association. Computational neuroscience: International review of neurobiology. (pp. 317–322). San Diego, CA: Academic Press.
Basso, M. A. Wurtz, R. H. (1998). Modulation of neuronal activity in superior colliculus by changes in target probability. Journal of Neuroscience, 18, 7519–7534. [PubMed] [Article] [PubMed]
Bechara, A. Damasio, H. Tranel, D. Damasio, A. R. (1997). Deciding advantageously before knowing the advantageous strategy. Science, 275, 1293–1295. [PubMed] [CrossRef] [PubMed]
Beutter, B. R. Eckstein, M. P. Stone, L. S. (2003). Saccadic and perceptual performance in visual search tasks: I Contrast detection and discrimination. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 20, 1341–1355. [PubMed] [CrossRef] [PubMed]
Biederman, I. Mezzanotte, R. J. Rabinowitz, J. C. (1982). Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14, 143–177. [PubMed] [CrossRef] [PubMed]
Biederman, I. Teitelbaum, R. C. Mezzanotte, R. J. (1983). Scene perception: A failure to find a benefit from prior expectancy or familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 411–429. [PubMed] [CrossRef] [PubMed]
Bishop, C. M. (1995). Neural networks for pattern recognition. USA: Oxford University Press.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436. [PubMed] [CrossRef] [PubMed]
Burgess, A. E. Wagner, R. F. Jennings, R. J. (1981). Efficiency of human visual signal discrimination. Science, 214, 93–94. [PubMed] [CrossRef] [PubMed]
Carpenter, R. H. Williams, M. L. (1995). Neural computation of log likelihood in control of saccadic eye movements. Nature, 377, 59–62. [PubMed] [CrossRef] [PubMed]
Cave, K. R. Bichot, N. P. (1999). Visuospatial attention: Beyond a spotlight model. Psychonomic Bulletin & Review, 6, 204–223. [PubMed] [CrossRef] [PubMed]
Chen, X. Zelinsky, G. J. (2006). Real-world visual search is dominated by top-down guidance. Vision Research, 46, 4118–4133. [PubMed] [CrossRef] [PubMed]
Chun, M. M. Jiang, Y. (1998). Contextual cueing: Implicit learning and memory of visual context guides spatial attention. Cognitive Psychology, 36, 28–71. [PubMed] [CrossRef] [PubMed]
Chun, M. M. Jiang, Y. (1999). Top-down attentional guidance based on implicit learning of visual covariation. Psychological Science, 10, 360–365. [CrossRef]
Chun, M. M. Nakayama, K. (2000). On the functional role of implicit visual memory for the adaptive deployment of attention across scenes. Visual Cognition, 7, 65–81. [CrossRef]
Cleeremans, A. Jimenez, L. French, R. M. Cleeremans, A. (2002). Implicit learning and consciousness: A graded, dynamic perspective. Implicit learning and consciousness. (pp. 1–40). Hove, UK: Psychology Press.
Corbetta, M. (1998). Frontoparietal cortical networks for directing attention and the eye to visual locations: Identical, independent, or overlapping neural systems? Proceedings of the National Academy of Sciences of the United States of America, 95, 831–838. [PubMed] [Article] [CrossRef] [PubMed]
Corbetta, M. Akbudak, E. Conturo, T. E. Snyder, A. Z. Ollinger, J. M. Drury, H. A. (1998). A common network of functional areas for attention and eye movements. Neuron, 21, 761–773. [PubMed] [Article] [CrossRef] [PubMed]
Daw, N. D. O'Doherty, J. P. Dayan, P. Seymour, B. Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441, 876–879. [PubMed] [CrossRef] [PubMed]
Dayan, P. Kakade, S. Montague, P. R. (2000). Learning and selective attention. Nature Neuroscience, 3, 1218–1223. [PubMed] [CrossRef] [PubMed]
Dayan, P. Zemel, R. S. (1999). Statistical models and sensory attention. Paper presented at the International Conference on Artificial Neural Networks.
Ding, L. Hikosaka, O. (2006). Comparison of reward modulation in the frontal eye field and caudate of the macaque. Journal of Neuroscience, 26, 6695–6703. [PubMed] [Article] [CrossRef] [PubMed]
Dosher, B. A. Lu, Z. L. (1998). Perceptual learning reflects external noise filtering and internal noise reduction through channel reweighting. Proceedings of the National Academy of Sciences of the United States of America, 95, 13988–13993. [PubMed] [Article] [CrossRef] [PubMed]
Dosher, B. A. Lu, Z. L. (2000a). Mechanisms of perceptual attention in precuing of location. Vision Research, 40, 1269–1292. [PubMed] [CrossRef]
Dosher, B. A. Lu, Z. L. (2000b). Noise exclusion in spatial attention. Psychological Science, 11, 139–146. [PubMed] [CrossRef]
Droll, J. A. Gigone, K. Hayhoe, M. M. (2007). Learning where to direct gaze during change detection. Journal of Vision, 7, (14):6, 1–12, http://journalofvision.org/7/14/6/, doi:10.1167/7.14.6. [PubMed] [Article] [CrossRef] [PubMed]
Duda, R. O. Hart, P. E. Stork, D. G. (2000). Pattern classification. New York: Wiley-Interscience.
Eckstein, M. P. Abbey, C. K. Pham, B. T. Shimozaki, S. S. (2004). Perceptual learning through optimization of attentional weighting: Human versus optimal Bayesian learner. Journal of Vision, 4, (12):3, 1006–1019, http://journalofvision.org/4/12/3/, doi:10.1167/4.12.3.
Eckstein, M. P. Ahumada, Jr., A. J. Watson, A. B. (1997). Visual signal detection in structured backgrounds: II. Effects of contrast gain control, background variations, and white noise. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 14, 2406–2419.
Eckstein, M. P. Beutter, B. R. Pham, B. T. Shimozaki, S. S. Stone, L. S. (2007). Similar neural representations of the target for saccades and perception during search. Journal of Neuroscience, 27, 1266–1270.
Eckstein, M. P. Beutter, B. R. Stone, L. S. (2001). The information limits of saccadic targeting during search. Perception, 30, 1389–1401.
Eckstein, M. P. Drescher, B. A. Shimozaki, S. S. (2006). Attentional cues in real scenes, saccadic targeting, and Bayesian priors. Psychological Science, 17, 973–980.
Eckstein, M. P. Liston, D. Krauzlis, R. J. (2007). Non-equivalence between attentional modulation and increase in signal contrast for superior colliculus neurons.
Eckstein, M. P. Pham, B. T. Shimozaki, S. S. (2004). The footprints of visual attention during search with 100% valid and 100% invalid cues. Vision Research, 44, 1193–1207.
Eckstein, M. P. Shimozaki, S. S. Abbey, C. K. (2002). The footprints of visual attention in the Posner cueing paradigm revealed by classification images. Journal of Vision, 2, (1):3, 25–45, http://journalofvision.org/2/1/3/, doi:10.1167/2.1.3.
Fei-Fei, L. Fergus, R. Perona, P. (2006). One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 594–611.
Findlay, J. M. (1997). Saccade target selection during visual search. Vision Research, 37, 617–631.
Fine, I. Jacobs, R. A. (2002). Comparing perceptual learning tasks: A review. Journal of Vision, 2, (2):5, 190–203, http://journalofvision.org/2/2/5/, doi:10.1167/2.2.5.
Fiser, J. Aslin, R. N. (2001). Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological Science, 12, 499–504.
Geng, J. J. Behrmann, M. (2002). Probability cuing of target location facilitates visual search implicitly in normal participants and patients with hemispatial neglect. Psychological Science, 13, 520–525.
Geng, J. J. Behrmann, M. (2005). Spatial probability as an attentional cue in visual search. Perception & Psychophysics, 67, 1252–1268.
Glimcher, P. W. (2003). The neurobiology of visual-saccadic decision making. Annual Review of Neuroscience, 26, 133–179.
Glimcher, P. W. Rustichini, A. (2004). Neuroeconomics: The consilience of brain and decision. Science, 306, 447–452.
Gold, J. Bennett, P. J. Sekuler, A. B. (1999). Signal but not noise changes with perceptual learning. Nature, 402, 176–178.
Goldberg, M. E. Bisley, J. Powell, K. D. Gottlieb, J. Kusunoki, M. (2002). The role of the lateral intraparietal area of the monkey in the generation of saccades and visuospatial attention. Annals of the New York Academy of Sciences, 956, 205–215.
Hayhoe, M. Ballard, D. (2005). Eye movements in natural behavior. Trends in Cognitive Sciences, 9, 188–194.
Hidalgo-Sotelo, B. Oliva, A. Torralba, A. (2005). Human learning of contextual priors on object search: Where does the time go? Paper presented at the Proceedings of the 3rd Workshop on Attention and Performance in Computer Vision, Int. CVPR.
Hikosaka, O. Takikawa, Y. Kawagoe, R. (2000). Role of the basal ganglia in the control of purposive saccadic eye movements. Physiological Reviews, 80, 953–978.
Hollingworth, A. Henderson, J. M. (1998). Does consistent scene context facilitate object perception? Journal of Experimental Psychology: General, 127, 398–415.
Hollingworth, A. Henderson, J. M. (2002). Accurate visual memory for previously attended objects in natural scenes. Journal of Experimental Psychology: Human Perception and Performance, 28, 113–136.
Ikeda, T. Hikosaka, O. (2003). Reward-dependent gain and bias of visual responses in primate superior colliculus. Neuron, 39, 693–700.
Ipata, A. E. Gee, A. L. Goldberg, M. E. Bisley, J. W. (2006). Activity in the lateral intraparietal area predicts the goal and latency of saccades in a free-viewing visual search task. Journal of Neuroscience, 26, 3656–3661.
Itti, L. Koch, C. (2001). Computational modeling of visual attention. Nature Reviews Neuroscience, 2, 194–203.
Jiang, Y. Leung, A. W. (2005). Implicit learning of ignored visual context. Psychonomic Bulletin & Review, 12, 100–106.
Jiménez, L. Vaquero, J. M. Lupiáñez, J. (2006). Qualitative differences between implicit and explicit sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 475–490.
Kersten, D. Mamassian, P. Yuille, A. (2004). Object perception as Bayesian inference. Annual Review of Psychology, 55, 271–304.
Kinchla, R. A. Chen, Z. Evert, D. (1995). Precue effects in visual search: Data or resource limited? Perception & Psychophysics, 57, 441–450.
Kowler, E. Anderson, E. Dosher, B. Blaser, E. (1995). The role of attention in the programming of saccades. Vision Research, 35, 1897–1916.
Lewicki, P. Hill, T. Czyzewska, M. (1992). Nonconscious acquisition of information. American Psychologist, 47, 796–801.
Lu, Z. L. Dosher, B. A. (1998). External noise distinguishes attention mechanisms. Vision Research, 38, 1183–1198.
Lu, Z. L. Dosher, B. A. (1999). Characterizing human perceptual inefficiencies with equivalent internal noise. Journal of the Optical Society of America A, Optics, Image Science, and Vision, 16, 764–778.
Luck, S. J. Hillyard, S. A. Mouloua, M. Hawkins, H. L. (1996). Mechanisms of visual-spatial attention: Resource allocation or uncertainty reduction? Journal of Experimental Psychology: Human Perception and Performance, 22, 725–737.
Maljkovic, V. Nakayama, K. (1994). Priming of pop-out: I. Role of features. Memory & Cognition, 22, 657–672.
Maljkovic, V. Nakayama, K. (1996). Priming of pop-out: II. The role of position. Perception & Psychophysics, 58, 977–991.
Mardia, K. V. Kent, J. T. Bibby, J. M. (1979). Multivariate analysis. San Diego: Academic.
Mazer, J. A. Gallant, J. L. (2003). Goal-related activity in V4 during free viewing visual search: Evidence for a ventral stream visual salience map. Neuron, 40, 1241–1250.
McCoy, A. N. Platt, M. L. (2005). Expectations and outcomes: Decision-making in the primate brain. Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, 191, 201–211.
Michel, M. M. Jacobs, R. A. (2007). Parameter learning but not structure learning: A Bayesian network model of constraints on early perceptual learning. Journal of Vision, 7, (1):4, 1–18, http://journalofvision.org/7/1/4/, doi:10.1167/7.1.4.
Niebles, J. Wang, H. Fei-Fei, L. (2006). Unsupervised learning of human action categories using spatial-temporal words. Paper presented at the British Machine Vision Conference, Edinburgh.
Palmer, J. Ames, C. T. Lindsey, D. T. (1993). Measuring the effect of attention on simple visual search. Journal of Experimental Psychology: Human Perception and Performance, 19, 108–130.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Peterson, M. S. Kramer, A. F. (2001). Attentional guidance of the eyes by contextual information and abrupt onsets. Perception & Psychophysics, 63, 1239–1249.
Platt, M. L. (2002). Caudate clues to rewarding cues. Neuron, 33, 316–318.
Platt, M. L. Glimcher, P. W. (1999). Neural correlates of decision variables in parietal cortex. Nature, 400, 233–238.
Posner, M. (1980). The orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Rajashekar, U. Bovik, A. C. Cormack, L. K. (2006). Visual search in noise: Revealing the influence of structural cues by gaze-contingent classification image analysis. Journal of Vision, 6, (4):7, 379–386, http://journalofvision.org/6/4/7/, doi:10.1167/6.4.7.
Rao, R. P. Zelinsky, G. J. Hayhoe, M. M. Ballard, D. H. (2002). Eye movements in iconic visual search. Vision Research, 42, 1447–1463.
Roesch, M. R. Olson, C. R. (2003). Impact of expected reward on neuronal activity in prefrontal cortex, frontal and supplementary eye fields and premotor cortex. Journal of Neurophysiology, 90, 1766–1789.
Schacter, D. L. (1987). Implicit memory: History and current status. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 501–518.
Seitz, A. R. Nanez, J. E. Holloway, S. Tsushima, Y. Watanabe, T. (2006). Two cases requiring external reinforcement in perceptual learning. Journal of Vision, 6, (9):9, 966–973, http://journalofvision.org/6/9/9/, doi:10.1167/6.9.9.
Seitz, A. Watanabe, T. (2005). A unified model for perceptual learning. Trends in Cognitive Sciences, 9, 329–334.
Serre, T. Kouh, M. Cadieu, C. Knoblich, U. Kreiman, G. Poggio, T. (2005). A theory of object recognition: Computations and circuits in the feedforward path of the ventral stream in primate visual cortex.
Shaw, M. L. (1982). Attending to multiple sources of information: I. The integration of information in decision-making. Cognitive Psychology, 14, 353–409.
Shaw, M. L. Shaw, P. (1977). Optimal allocation of cognitive resources to spatial locations. Journal of Experimental Psychology: Human Perception and Performance, 3, 201–211.
Shimozaki, S. S. Eckstein, M. P. Abbey, C. K. (2003). Comparison of two weighted integration models for the cueing task: Linear and likelihood. Journal of Vision, 3, (3):3, 209–229, http://journalofvision.org/3/3/3/, doi:10.1167/3.3.3.
Sprague, N. Ballard, D. Robinson, A. (in press). ACM Transactions on Applied Perception.
Stuphorn, V. Taylor, T. L. Schall, J. D. (2000). Performance monitoring by the supplementary eye field. Nature, 408, 857–860.
Sugrue, L. P. Corrado, G. S. Newsome, W. T. (2004). Matching behavior and the representation of value in the parietal cortex. Science, 304, 1782–1787.
Sugrue, L. P. Corrado, G. S. Newsome, W. T. (2005). Choosing the greater of two goods: Neural currencies for valuation and decision making. Nature Reviews Neuroscience, 6, 363–375.
Sutton, R. S. Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press, A Bradford Book.
Tavassoli, A. van der Linde, I. Bovik, A. C. Cormack, L. K. (2007). An efficient technique for revealing visual search strategies with classification images. Perception & Psychophysics, 69, 103–112.
Thompson, K. G. Bichot, N. P. Sato, T. R. (2005). Frontal eye field activity before visual search errors reveals the integration of bottom-up and top-down salience. Journal of Neurophysiology, 93, 337–351.
Thompson, K. G. Hanes, D. P. Bichot, N. P. Schall, J. D. (1996). Perceptual and motor processing stages identified in the activity of macaque frontal eye field neurons during visual search. Journal of Neurophysiology, 76, 4040–4055.
Torralba, A. Oliva, A. Castelhano, M. S. Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: The role of global features in object search. Psychological Review, 113, 766–786.
Tversky, A. Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Walthew, C. Gilchrist, I. D. (2006). Target location probability effects in visual search: An effect of sequential dependencies. Journal of Experimental Psychology: Human Perception and Performance, 32, 1294–1301.
Watanabe, K. Lauwereyns, J. Hikosaka, O. (2003). Neural correlates of rewarded and unrewarded eye movements in the primate caudate nucleus. Journal of Neuroscience, 23, 10052–10057.
Yu, A. J. Dayan, P. (2005). Uncertainty, neuromodulation, and attention. Neuron, 46, 681–692.
Figure 1
 
“Localize or Reject” task. Subjects initiated each trial by fixating a central dot and pressing the space bar. Test stimuli included six colored circles, randomly arranged, each containing a contrast increment (e.g. Gaussian dot). Subjects were instructed to determine if one of the dots within the colored circles was a bright target among a set of less bright distractors. Following a mask of white noise, subjects used the mouse to either click on the cue in which they thought the target appeared, or to click on the word “No” that appeared randomly in one of the four corners of the screen, if they thought no target was present. Targets were present in half of all trials, and among these target-present trials the target would more frequently appear within some cues than other cues. Three feedback conditions are shown.
Figure 2
 
Average proportion correct for saccadic (A) and perceptual (B) decisions for different conditions of feedback. Performance was generally better with more informative feedback, although all conditions showed improvement. (Note that percent correct for first saccades is low due to the naturally high false alarm rate in target-absent trials.) SEMs represent between-subject differences.
Figure 3
 
Measure of human efficiency as compared to the ideal observer, using three different techniques of acquiring values of cue validity. Efficiency measures for the ideal observer when learning, or when informed of, cue validity are very similar, revealing the rapid and accurate estimate of cue validity for the ideal observer learning algorithm with supervised feedback.
Figure 4
 
General performance measures throughout the experiment. (Note that measures of saccadic performance maintain the arbitrary naming convention as used in the perceptual task of localizing or rejecting the target. However, subjects were not given any instruction on the use of eye movements to complete the perceptual task. See text for details.)
Figure 5
 
Hit rate for saccades (A) and behavioral decisions (B) for each cue. Conditions with more informative feedback had a greater number of hits on probable target cues. Performance is averaged across cues with identical validity.
Figure 6
 
False alarm rate for saccades (A) and behavioral decisions (B) for each cue. Note increased cueing effect in conditions with more informative feedback. Performance is averaged across cues with identical validity.
Figure 7
 
Average start time for first fixation to a cue. Longer reaction times for the first saccade in the reinforcement and supervised learning conditions may reflect the planning of saccadic decisions that take cue validity into account. This increased delay parallels the cueing effect found in those conditions.
Figure 8
 
Average number of fixations (A) and average total fixation duration (B) during the entire 2000 ms test stimulus.
Figure 9
 
Average explicit estimates of cue validity. (Actual cue colors were counterbalanced across subjects.) Although subjects' saccadic and perceptual decisions did not reflect differences in cue validity in the unsupervised condition, these subjects nevertheless reported reasonably accurate estimates of cue validity, similar to those in conditions with supervised or reinforcement feedback.