It is well known that visual perception (and selective attention) is capacity limited (Marois & Ivanoff,
2005). For example, only a few items can be selected or tracked at once (Scimeca & Franconeri,
2015). However, as with low-level features (e.g., size, orientation, contrast), ensemble coding of higher-order visual information, such as facial expressions, has been shown to be robust and flexible. Establishing such a condensed ensemble representation is thought to provide an efficient way to cope with these limited-capacity bottlenecks in visual processing (Alvarez,
2011; Cohen, Dennett, & Kanwisher,
2016; Whitney et al.,
2014). A key finding supporting this view is that ensemble representations remain strikingly accurate even when individual representations are impoverished or practically lost because of limited attentional resources (Alvarez & Oliva,
2008; Alvarez & Oliva,
2009; Fischer & Whitney,
2011; Haberman & Whitney,
2009; Haberman & Whitney,
2011). The visual system can compensate for noisy individual representations by pooling across them to represent ensemble statistics. For example, when observers were blind to local changes in emotional expression (i.e., they could not localize which face had changed its expression), they could nevertheless accurately report changes in the average emotion of the 16 faces shown in the set (Haberman & Whitney,
2011). Similarly, although crowding left participants unaware of the emotional expression of the central face in the set, that expression nonetheless influenced the perceived average emotion of the entire set (Fischer & Whitney,
2011). Additional evidence in favor of a capacity-unlimited process comes from findings that performance with large set sizes is comparable to performance with small set sizes (mean emotion, see Haberman & Whitney,
2009; mean size, see Ariely,
2001; Chong & Treisman,
2003), and in some circumstances performance was even better with large sets than with small sets (mean size, see Robitaille & Harris,
2011).
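
The idea that pooling across noisy individual representations can yield a precise ensemble estimate can be illustrated with a toy simulation. The sketch below is not a model taken from any of the cited studies. It assumes, purely for illustration, that each face is encoded with independent Gaussian noise, that all items are weighted equally, and that emotional intensity lies on an arbitrary 0-100 scale. The function name `simulate` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate(set_size, noise_sd=20.0, n_trials=2000, seed=0):
    """Return (mean abs. error for one item, mean abs. error for the set mean).

    Illustrative assumptions: independent Gaussian encoding noise per item,
    equal weighting of items, intensities on an arbitrary 0-100 scale.
    """
    rng = random.Random(seed)
    item_errors = []
    ensemble_errors = []
    for _ in range(n_trials):
        # True emotional intensity of each face in the set.
        true_values = [rng.uniform(0, 100) for _ in range(set_size)]
        # Noisy internal representation of each individual face.
        noisy = [v + rng.gauss(0, noise_sd) for v in true_values]
        # Error when reporting a single item from its noisy representation.
        item_errors.append(abs(noisy[0] - true_values[0]))
        # Error when reporting the set mean by averaging the noisy items.
        ensemble_errors.append(
            abs(statistics.fmean(noisy) - statistics.fmean(true_values))
        )
    return statistics.fmean(item_errors), statistics.fmean(ensemble_errors)


if __name__ == "__main__":
    for n in (4, 16):
        item_err, ens_err = simulate(n)
        print(f"set size {n:2d}: item error {item_err:.1f}, "
              f"ensemble error {ens_err:.1f}")
```

Under these assumptions the error of the averaged estimate shrinks roughly with the square root of the set size, whereas the error for any single item does not. This is consistent with precise ensemble reports despite poor individual representations, and with better performance for larger sets. By itself, such a simple pooling account would not predict comparable performance across set sizes, so it should be read as an intuition aid only.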