Perception of a visual object can be impaired when there are other objects nearby in the visual field (e.g., Bouma, 1970; Strasburger, Harvey, & Rentschler, 1991). This is known as the crowding effect. In the fovea, only a small crowding effect is usually observed. In the periphery, the spatial extent of crowding is approximately proportional to the eccentricity of the target object and reaches 0.5 E, where E is the eccentricity of the target (Bouma, 1970; Toet & Levi, 1992). It seems that detection of simple visual features is very little affected by crowding (Andriessen & Bouma, 1976; Levi, Hariharan, & Klein, 2002; Pelli, Palomares, & Majaj, 2004), and the main mechanism of adverse interaction must therefore be located at some level after feature detection.
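As a rough illustration of the proportionality rule cited above, the following sketch computes the approximate spatial extent of crowding as 0.5 E. The function names and the `factor` parameter are illustrative assumptions; 0.5 is the upper value mentioned above, and the actual extent may vary with stimuli and task.

```python
def crowding_extent(eccentricity_deg, factor=0.5):
    """Approximate spatial extent of crowding (deg) for a target at the
    given eccentricity (deg), assuming simple proportionality."""
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return factor * eccentricity_deg


def is_within_crowding_zone(eccentricity_deg, spacing_deg, factor=0.5):
    """True if a flanker at the given target-flanker spacing (deg) falls
    inside the assumed crowding extent around the target."""
    return spacing_deg < crowding_extent(eccentricity_deg, factor)
```

For example, under these assumptions a target at 10 deg eccentricity would be expected to suffer interference from flankers closer than about 5 deg.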
The exact nature of crowding, in a computational sense, is not clear. Theoretical models of crowding are based on an interaction between feature detectors (e.g., Bjork & Murray, 1977; Wolford & Chambers, 1984), or on pooling (integration) of the signals of feature detectors over some larger area (e.g., Parkes, Lund, Angelucci, Solomon, & Morgan, 2001; Pelli et al., 2004). Intriligator and Cavanagh (2001) have argued that crowding is an effect of the insufficient spatial resolution of attention, which is limited by the large size of the receptive fields at some higher level of visual processing.
Several authors have suggested that some sort of positional noise may account for the crowding effects (Neri & Levi, 2006; Wolford, 1975). According to this account, visual features are incorrectly localized, or can migrate to incorrect locations, thereby producing errors of object perception. Simple pooling models assume a full loss of positional information within a certain field of integration. Parkes et al. (2001) found that orientation discrimination thresholds in crowding conditions could be accounted for by a simple averaging of feature (orientation) values over a certain region. With similar stimuli, Baldassi, Megna, and Burr (2006) obtained results that were not consistent with averaging and suggested that some other (non-linear) combination rule had to be used to model their data. In any case, it is not clear how to extend these simple pooling rules to more complex stimuli with a number of features.
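The positional-noise account can be made concrete with a minimal one-dimensional sketch. It assumes (hypothetically, not as a model taken from the cited studies) that the perceived position of a flanker feature equals its true position plus Gaussian noise with standard deviation `sigma`, and that the feature "migrates" when it is perceived closer to the target location than to its own. All names and parameters are illustrative.

```python
import math
import random


def analytic_migration_probability(spacing, sigma):
    """Probability that Gaussian positional noise moves a flanker feature
    past the midpoint between flanker and target (1-D, assumed model)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # P(noise < -spacing / 2) for noise ~ N(0, sigma^2)
    return 0.5 * math.erfc(spacing / (2.0 * sigma * math.sqrt(2.0)))


def simulated_migration_rate(spacing, sigma, n_trials=10000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    target_pos, flanker_pos = 0.0, spacing
    migrations = 0
    for _ in range(n_trials):
        perceived = flanker_pos + rng.gauss(0.0, sigma)
        # The feature is attributed to whichever location is nearer.
        if abs(perceived - target_pos) < abs(perceived - flanker_pos):
            migrations += 1
    return migrations / n_trials
```

In this sketch the migration probability falls as target-flanker spacing grows relative to `sigma`, which is one simple way such a model could produce a spatially limited crowding effect.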
Pooling (integration) models should predict a strong effect of the number of distractors within the integration area, because with a larger number of distractors, the target signal is diluted in a larger amount of irrelevant activity. The results obtained with simple stimuli (Parkes et al., 2001) are consistent with this prediction. With letters, the results are ambiguous: Strasburger et al. (1991) found a relatively strong effect of the number of flankers, whereas Pelli et al. (2004) found no difference between 2 and 4 flankers.
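The dilution argument can be illustrated with a minimal averaging sketch. It assumes equal-weight, noise-free averaging of orientation values over the target and all distractors in the integration area; this is one simple reading of an averaging rule, not the model fitted in the cited studies, and the function names are hypothetical.

```python
def pooled_orientation(target_tilt, distractor_tilts):
    """Equal-weight average of target and distractor tilts (deg)."""
    values = [target_tilt] + list(distractor_tilts)
    return sum(values) / len(values)


def required_target_tilt(criterion_tilt, n_distractors):
    """Target tilt needed for the pooled signal to reach a criterion when
    all distractors have zero tilt: the target is diluted by 1 / (n + 1)."""
    return criterion_tilt * (n_distractors + 1)
```

Under these assumptions, the target tilt needed to reach a fixed pooled criterion grows linearly with the number of untilted distractors, which is the strong set-size effect that pooling models should predict.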
Many studies with alphanumeric characters have found that flanking objects are frequently reported instead of the target (e.g., Eriksen & Rohrbaugh, 1970; Huckauf & Heller, 2002; Strasburger, 2005). These studies suggest that the positions of integrated objects, rather than those of features, are perceived incorrectly. Strasburger (2005) explains this by imprecise focusing of attention. Some studies have suggested that object identification and localization errors are differentially affected by target–flanker distance (Butler & Currie, 1986) and exposure duration of stimuli (Styles & Allport, 1986). With alphanumeric characters, it is difficult to separate the roles of features and their combinations, because the relevant features are largely unknown. Wolford and Shum (1980) used specially designed stimuli and reported some support for the idea that object localization errors are mediated by different (higher level) mechanisms as compared with feature localization errors.
There are different ideas about the effects of target–flanker similarity. Some early studies, using alphanumeric characters, found more degradation of performance with more similar flankers and argued for an interaction between mechanisms sensitive to similar features (e.g., Bjork & Murray, 1977). Estes (1982), however, demonstrated that these effects could be explained by the criterion shifts induced by flankers. He found a strong bias towards reporting a target similar or identical to the flankers, but the discriminability of targets was affected little, if at all, by target–flanker similarity. He also found that position errors were more frequent when target–flanker similarity was high.
Several recent studies have demonstrated that pop-out of the target with a unique visual feature can reduce the crowding effect (e.g., Felisberti, Solomon, & Morgan, 2005; Kooi, Toet, Tripathy, & Levi, 1994; Põder, 2006). The role of salience in more complex conditions with multidimensional and/or heterogeneous stimuli is not clear.
Illusory conjunctions are perceptual errors in which visual features from different objects are incorrectly combined. Pelli et al. (2004) drew attention to the similarity between the conditions of crowding and those of illusory conjunctions. Both appear predominantly in the visual periphery, and when irrelevant objects are located not far from the target. While some early studies suggested that features can be incorrectly conjoined regardless of the distance between them (e.g., Treisman & Schmidt, 1982), more recent studies have shown a strong effect of spatial proximity (e.g., Ashby, Prinzmetal, Ivry, & Maddox, 1996; Prinzmetal, Ivry, Beck, & Shimizu, 2002). The effect of target–distractor similarity is more controversial. Treisman and Schmidt (1982) found that feature migrations causing illusory conjunctions were independent of the similarity of the target and distractor objects on the other feature dimensions. Ivry and Prinzmetal (1991) studied the migration of features when the similarity of objects on the same feature dimension was varied, and they found more frequent illusory conjunctions with more similar features. Donk (1999), however, reported an effect of similarity along other dimensions and no effect of similarity along the same dimension. In the studies of illusory conjunctions, different measures (dual tasks, backward masking) have been used in order to avoid focused attention to the target, and it is hard to say whether these findings are valid for simpler crowding displays.
The purpose of this study is to explore the role of features and their combinations in a simple crowding experiment. We use multidimensional stimuli with simple and well-defined visual features that can be combined to create a set of objects comparable to letters or numerals. We aim to reveal how distractor objects and their features affect perception of the target, and whether these effects can be predicted by any simple model.