A study by Ariely (2001) embodies a central thread running through much research on perceptual averaging in visual short-term memory (VSTM). In that study, subjects were presented with memory sets comprising multiple circles whose diameters differed. Following this, a test stimulus with one or two probe circles was presented. The number of probes differed only with the nature of the test: yes/no (one probe) or two-alternative forced choice (two probes). Since results from the two tasks were similar, we will describe only the yes/no variant. Subjects judged the test stimulus in one of two ways. On some trials, they made a yes/no response as to whether the probe had been in the memory set (member identification). On other trials, the response indicated whether the probe circle was larger or smaller than the mean of the items' sizes in the memory set (mean discrimination). The results showed that despite mere chance performance on member identification, mean discrimination was quite good. On the basis of these results, Ariely speculated that “when presented with a set of four or more similar items, the visual system creates a representation of the set, and discards information about the individual items in the set” (p. 160). Furthermore, he suggested that the use of such statistics was adaptive: “The reduction of a set of similar items to a mean (or prototypical value), a range, and a few other important statistical properties may preserve just the information needed to navigate in the real world, to form a stable global percept, and to identify candidate locations of interest” (p. 161).
A major contribution to work on perceptual averaging was the establishment of attention's role in these phenomena. Studies of attention's impact have supported Ariely's conclusion that perceptual averages may be preserved even in the absence of memory for individual items themselves (Ariely, 2001; Corbett & Oriet, 2011). For instance, Alvarez and Oliva (2008) presented subjects with sets of eight moving circles. The subjects' task was to count the number of times four of the moving circles (the target set) touched two red lines contained in the display. While making this judgment, subjects were to ignore the other four moving circles (the distracter set). Following this, a display was presented that contained either all but one of the circles, in their resting positions (individual test), or four of them in their resting positions (centroid test). In the individual test, subjects indicated the location that would have been occupied by the one missing circle. In the centroid test, they indicated the mean location of the four missing items. Alvarez and Oliva found that, though accuracy was higher for targets than distracters in the individual test, performance on the centroid test was equivalent for targets and distracters. This suggests that perceptual averages are preserved even for items that are filtered out of awareness by selective attention (see also Alvarez & Oliva, 2009).
Other studies of VSTM have used various tasks to explore different aspects of this averaging process. These tasks include motion detection (Ball & Sekuler, 1980), multiple-object tracking (Alvarez & Oliva, 2008), change detection (Alvarez & Oliva, 2009), rapid serial visual presentation (Corbett & Oriet, 2011), and Sternberg's memory scanning task (Dubé, Zhou, Kahana, & Sekuler, 2014), in addition to tasks in which subjects had to report some average feature of a briefly presented stimulus display (Chong & Treisman, 2005a, 2005b; Emmanouil & Treisman, 2008). These studies make it clear that perceptual averaging can operate on remembered direction, location, size, speed, texture, and even facial expression (Haberman, Harp, & Whitney, 2009).
Furthermore, perceptual averaging is not limited to stimuli in which elements are distributed over space, as in Ariely's and so many other studies. Perceptual averaging can also operate on elements that are distributed in time (e.g., Albrecht & Scholl, 2010; Corbett & Oriet, 2011; Dubé et al., 2014; Haberman et al., 2009). In one such study, Haberman et al. (2009) presented subjects with a series of briefly presented face morphs that varied in emotional expression. Subjects were asked whether a probe face was more or less disgusted-looking than the average expression of the faces they had just seen. The results showed that subjects' judgments were quite accurate, demonstrating that perceptual averaging can occur across time. Subsequent studies have extended this finding to paradigms involving abstract geometric stimuli of the sort commonly used in studies of visual search and change detection, as well as stimuli that change dynamically over time (e.g., Albrecht & Scholl, 2010). These important findings link perceptual averaging to earlier studies that suggest that average or prototypical features can be computed over successive trials in tasks involving comparative judgment (Morgan, Watamaniuk, & McKee, 2000), categorization (Busemeyer & Myung, 1988), and VSTM (Wilken & Ma, 2004). In other words, it seems that perceptual averages can be extracted from spatially defined sets of items or from temporally defined ones. This latter, temporal mode of averaging is important in part because it reminds us that an averaging process need not be restricted to events that occur within the experimenter-defined boundaries of an experimental trial (e.g., Morgan et al., 2000). Instead, temporal averaging can and does extend to events and items that span such boundaries.
In the aggregate, these studies suggest that perceptual averaging may be integral to visual memory, and that it can interact with and influence functions such as feature matching, object updating, expectation-based monitoring, and perceptual binding (Alvarez, 2011; Treisman, 2006). Although the details of experimental tasks and stimuli vary considerably among these studies, all have in common with Ariely's study the requirement that subjects compute a perceptual average and communicate some direct or indirect indicator of that computation. On the basis of such studies alone, it is difficult to say whether the averaging mechanism is essential to the visual memory system or merely activated and driven by the instruction to compute such an average. Recent work from our laboratory (discussed later) suggests that perceptual averaging is in fact an essential, obligatory aspect of memory encoding, one that influences VSTM responses even in the absence of any instruction to compute or report an average.