Our results agree with a simple, physiologically motivated model for pattern backward masking of natural scenes. This model assumes the major processing stages found in many other models of visual processing (Kosslyn, 1999; Lennie, 1998; Marr, 1982; Sperling, 1963). There is a representation of the scene in a visual buffer, whether it is called the raw primal sketch (Marr, 1982) or iconic memory (Sperling,
1963), that encodes the physical properties of the stimulus. This representation is continuously analyzed by higher level processing stages and transformed into increasingly abstract entities, such as surfaces or objects, that are transferred into memory. In this model, the pattern mask exerts its detrimental effect by overwriting the template of the stimulus in the sensory buffer and replacing it with a representation of the mask. Similar physiological mechanisms were suggested by Bullier (2001) and Lamme and Roelfsema (2000). The overwriting of the sensory buffer is possible because the mask produces a much stronger initial activation peak than the target (Figures 5A and 5C), which is presumably due to the higher RMS contrast of the mask. Mask contrast has been shown to be an important determinant of the strength of a pattern mask (Turvey,
1973). Both masking by integration (Enns & Di Lollo, 2000; Turvey, 1973) and masking by competition (Keysers & Perrett, 2002) are compatible with this view. The earlier and more severely the analysis of the template is disrupted, the less information is extracted and the greater the reduction in recognition rates (Turvey, 1973). The visual buffer may be implemented in the retinotopic early visual areas (Bullier, 2001; Lamme & Roelfsema, 2000). This view is supported by studies that examined the effects of masking at higher level visual-processing stages. A pattern mask decreased the neuronal responses to the target object and the information available in the spike train in macaque IT and STS when the SOA between target and pattern mask was reduced (Kovacs et al., 1995; Rolls & Tovee, 1994; Rolls et al.,
1999). Keysers and Perrett (2002) and Keysers, Xiao, Foldiak, and Perrett (2001) found the same effect in STS neurons when the change rate in a rapid-serial-visual-presentation paradigm increased. In humans, Grill-Spector et al. (2000) found a reduction of the BOLD fMRI activation in shape- and object-specific brain areas, such as LOC and the fusiform face area, when the target-to-pattern-mask SOAs were sufficiently short. The reduction in activation correlated with the decrease in recognition rate at short SOAs.
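
The argument above attributes the stronger initial activation of the mask to its higher RMS contrast. As a minimal sketch of how such a comparison could be computed, the Python snippet below uses one common definition of RMS contrast: the standard deviation of pixel luminance divided by mean luminance. This definition, the function name `rms_contrast`, and the two 2x2 luminance patches are assumptions made for illustration. They are not taken from the study, which may have normalized contrast differently.

```python
from statistics import fmean, pstdev


def rms_contrast(pixels):
    """Return RMS contrast of a 2-D luminance array (list of rows).

    Assumed definition: population standard deviation of luminance
    divided by mean luminance. Other normalizations exist.
    """
    values = [float(v) for row in pixels for v in row]
    mean = fmean(values)
    if mean == 0:
        raise ValueError("mean luminance is zero; RMS contrast is undefined")
    return pstdev(values) / mean


# Hypothetical luminance patches (arbitrary units), for illustration only.
low_contrast_target = [[90, 110], [110, 90]]
high_contrast_mask = [[20, 180], [180, 20]]

target_contrast = rms_contrast(low_contrast_target)
mask_contrast = rms_contrast(high_contrast_mask)
print(f"target: {target_contrast:.2f}, mask: {mask_contrast:.2f}")
```

Both patches have the same mean luminance, so the difference in the returned values reflects only the spread of pixel luminance around that mean.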