Abstract
Observers can reliably perceive the average of an ensemble of stimuli (Watamaniuk et al., 1992; Dakin et al., 1997; Chong and Treisman, 2001), including high-level stimuli such as faces (Haberman and Whitney, 2007). However, it has been suggested that ensemble coding is simply the averaging of foveal information over successive saccades. To test this, we performed a study in which subjects viewed sets of 24 faces with and without foveal information available. Face stimuli were selected from a set of 147 morphs, with expressions ranging from happy to sad to angry. After 1500 ms of free viewing, subjects were asked to report the mean expression of the ensemble. In one condition, subjects viewed the array of faces unobstructed; in a critical second condition, a gaze-contingent circular occluder (2.6° in diameter) completely blocked foveal information. Subjects performed equally well across the occluded and non-occluded conditions, with no significant difference in response error. Additionally, when foveal information was occluded, subjects spent significantly more time fixating between faces in the ensemble (rather than directly on a face) than in the unobstructed condition. In a follow-up experiment, we varied the proportion of faces that were visible in the set; we found that subjects' performance improved as more faces were presented, regardless of whether foveal information was available, indicating that they were integrating information from multiple faces in the display under both non-occluded and occluded conditions. Thus, while subjects adopted different fixation patterns between conditions, fixating away from faces more often in the occluded condition, performance on the task was unaffected by the loss of foveal information. This indicates that ensemble perception of faces is not determined solely by foveal input, but can operate entirely on information in the peripheral visual field.
Meeting abstract presented at VSS 2014