The visual system can integrate discrete units of information to construct a coherent description of the input it receives. Little is known about the processes of grouping and their implementation in the visual system. Previously we described a configural effect in which the global arrangement (degree of co-circularity) of Gabor flankers affected the degree of crowding with a Gabor target that they surrounded. Here we tested possible mechanisms by which the configural effect might operate in crowding. We ruled out simple explanations based on the effect of basic units constructing these configurations (pairs of opposing Gabors). Our results support an explanation for crowding that is based on grouping processes between flankers. They also suggest that not all flankers necessarily directly affect the target. Flankers might group together and interact with the target as a single element. Finally, using a computational model of crowding based on compulsory grouping (following Gestalt principles) and segmentation, we define the relative contribution of different pair relations of grouped elements to crowding.

*SE*) of the mean log data.

*λ* and an *SD* of the Gaussian envelope *σ*, in this case *λ* = *σ* = 0.16°, of 80% contrast. Eight Gabor pairs were tested; those pairs were chosen from the three original configurations (smooth, interrupted, and sun; Figure 1, bottom right). The Gabors in each pair were either collinear or parallel to each other. They had one of three locations relative to the target-fixation plane: above and below the target, left and right of it, or in a diagonal arrangement (two sets of diagonal pairs were tested independently, but since the results were similar, we report them together). In addition, three of the subjects were tested with the three original configurations for comparison (see Figure 1).

*t*-tests with a Bonferroni correction to the significance level): the two diagonal pairs, namely the collinear pair (*t*(5) = 5.26, *p* = 0.003) and the parallel pair (*t*(5) = 10.71, *p* = 0.0001), and the horizontal collinear pair (*t*(5) = 5.96, *p* = 0.002) (uncorrected alpha). The other three pairs did not produce a significant threshold elevation. The results of the three original configurations were in agreement with our previous study, with the smooth configuration producing the least amount of crowding.

*F*(2,36) = 17.89, *p* < 0.0001) and a marginal effect for the number of flankers (*F*(3,36) = 2.84, *p* = 0.052) and the interaction of the two factors (*F*(2,36) = 3.2, *p* = 0.053). Although the effect of local contrasts was significant, a post-hoc test indicated that the pattern of the results is at odds with the local contrast explanation we described above. Low local contrasts (found in the smooth configurations) produced less interference than both the no-contrast conditions (*p* < 0.0001) and the high-contrast conditions (*p* = 0.004). The other two conditions were not significantly different from each other. This means that there is a non-monotonic relation between the degree of local contrasts and the degree of crowding, in contrast to the prediction of the local contrasts explanation. Although the other two effects were only marginally significant, we decided to analyze them as well. Post-hoc tests for the effect of the number of flankers found no significant differences between any of the conditions. We tested the interaction by analyzing the effect of the number of flankers separately for each local contrast group. The only significant effect was found in the no-contrasts group (*F*(1,10) = 7.68, *p* = 0.02), indicating more crowding in the collinear and parallel trios (configurations j and k, respectively) than in the pair configuration (i). For the other two local contrast groups there was no such effect. This indicates that the total number of local contrasts is not predictive of the crowding in the display.

*λ* = *σ* = 0.12°), of 90% contrast. Eight flankers were arranged around the target in either a smooth co-circular arrangement (Figure 4a) or in a perpendicular arrangement relative to the imaginary circle (sun configuration; Figure 4d) at a fixed target–flanker separation of 0.61° (equal to 0.2 of the eccentricity used, E). Eight additional flankers were added farther away from the target, in one of the above arrangements (Figures 4b, 4c, 4e, and 4f). Four different target–flanker separations were used for this second layer of flankers: 1.05°, 1.23°, 1.41°, and 1.6° (0.35E, 0.41E, 0.47E, and 0.53E, respectively). Each arrangement and separation was tested separately.

*t*-tests with a Bonferroni correction for multiple testing; the results were significant for all the conditions: smooth–smooth (*t*(11) = 9.36, *p* < 0.0001); smooth–sun (*t*(11) = 11.7, *p* < 0.0001); sun–sun (*t*(11) = 13.76, *p* < 0.0001); sun–smooth (*t*(11) = 11.72, *p* < 0.0001) (uncorrected alpha). Next, we calculated the additional threshold elevation produced by the outer eight flankers. This was done by subtracting from the measured threshold the threshold measured using only the inner eight flankers (either the smooth or the sun configuration). Results are presented in Figure 5 (lower graph). A 2-way ANOVA (configuration × separation) indicated a significant effect for configuration (*F*(3,16) = 24.59, *p* < 0.0001), but not for the separation of the outer layer from the target (*F*(3,16) = 0.18, *p* > 0.1). Finally, we performed another four single-sample *t*-tests with a Bonferroni correction on these results and found that even after the subtraction, all the configurations showed significant crowding: smooth–smooth (*t*(11) = 5.34, *p* < 0.0001); smooth–sun (*t*(11) = 13.85, *p* < 0.0001); sun–sun (*t*(11) = 4.95, *p* < 0.0001); and sun–smooth (*t*(11) = 4.10, *p* = 0.002) (uncorrected alpha), although in some cases this elevation was rather small.
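The Bonferroni check on the four post-subtraction tests can be illustrated with a minimal sketch. It assumes a family-wise significance level of 0.05, which is not stated in the text, and enters the values reported as *p* < 0.0001 at that bound. The names `reported_p`, `FAMILY_ALPHA`, and `bonferroni_alpha` are illustrative only.

```python
# Reported (uncorrected) p-values for the four single-sample t-tests
# performed after subtracting the inner-layer threshold.
# Values reported as "p < 0.0001" are entered at that upper bound.
reported_p = {
    "smooth-smooth": 0.0001,
    "smooth-sun": 0.0001,
    "sun-sun": 0.0001,
    "sun-smooth": 0.002,
}

FAMILY_ALPHA = 0.05  # ASSUMPTION: family-wise level, not given in the text


def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return family_alpha / n_tests


per_test_alpha = bonferroni_alpha(FAMILY_ALPHA, len(reported_p))

# A test counts as significant if its uncorrected p-value falls below
# the corrected per-test level.
significant = {name: p < per_test_alpha for name, p in reported_p.items()}

for name, is_sig in significant.items():
    print(f"{name}: p = {reported_p[name]}, significant = {is_sig}")
```

Under this assumed level, the per-test criterion is 0.05 / 4 = 0.0125, so the largest reported value (*p* = 0.002) still passes.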

*D*_{e} is the two-dimensional Euclidean distance between the two patches (*i*, *j*) and *f*_{ij} is a measure of their continuity defined as:

*θ*_{i} and *θ*_{j} are the acute angles between Gabor patches *g*_{i} and *g*_{j} (correspondingly) and the line connecting their centers, and *ϕ*_{ij} is the acute angle between the two virtual lines aligned to the two patches (all angles measured in radians). Variables a and b are constants (2.4 and 0.8, respectively). Their values are derived to optimize performance on a contour integration task independently of the crowding task as described in 2. Function *f* emphasizes continuity by taxing (assigning high values to) deviation from continuity (see 2 for more details and Figure A1 for its output for different orientations). Although *D*_{s} values may be best described as representing the strength of relations between the two elements in each pair, it is more convenient to consider them as distances between elements.

We compute *D*_{s} for each pair and connect each element to its closest neighbor(s). The connection is performed by a variation of Kruskal's minimum spanning tree algorithm (Kruskal, 1956). In our version, at each step, one distance value is considered (out of all the computed *D*_{s} values, starting with the smallest and increasing in consecutive steps). Each pair of elements that has the considered distance is connected (unless an indirect path connecting them has been established in previous steps). In the next step, the distance value is increased and the connection process is repeated. As in the original algorithm, our process stops when all the elements are connected by either a direct or an indirect connection.

There is one main difference between our version and the original algorithm, leading to differences in their outputs. In Kruskal's original algorithm, at each step only one pair of elements is connected. In the next step, the same distance value is considered. If there are unconnected pairs that have this value, one of them is connected unless its elements are already connected by an indirect path. If no such pair exists, the distance value is increased. The first difference in the outputs is that whereas in the original method we end up with a unique path between all elements, in our version we might end up with multiple alternative paths. The second difference is that since we connect pairs having the same distance value in parallel, there is only one grouping solution. In the original version, the connection is performed serially and therefore the solution may not be unique: it may depend on the order in which we choose to consider the pairs. Figure A2 displays the grouping pattern of several configurations. When the grouping stage is complete, we define for each Gabor a set (Ce) of other Gabors directly connected to it.
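The parallel variant of Kruskal's algorithm described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it takes a precomputed symmetric matrix of *D*_{s} values as input (the formula for *D*_{s} is not reproduced here), and the function names and the tie tolerance `tol` are assumptions introduced for the example.

```python
from itertools import combinations


def find(parent, x):
    """Union-find root lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_elements(dist, tol=1e-9):
    """Connect elements level by level, all tied pairs in parallel.

    dist: symmetric n x n list of lists holding precomputed D_s values.
    Returns (edges, connected), where connected[i] is the set of elements
    directly connected to element i (the set called Ce in the text).
    """
    n = len(dist)
    parent = list(range(n))
    pairs = sorted(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    edges = []
    n_groups = n
    k = 0
    while n_groups > 1 and k < len(pairs):
        level = dist[pairs[k][0]][pairs[k][1]]
        # Collect every pair whose distance equals the current level.
        tied = []
        while k < len(pairs) and dist[pairs[k][0]][pairs[k][1]] - level <= tol:
            tied.append(pairs[k])
            k += 1
        # Connectivity is judged against PREVIOUS steps only, so all tied
        # pairs that were still unconnected get a direct connection.
        new = [(i, j) for i, j in tied if find(parent, i) != find(parent, j)]
        for i, j in new:
            edges.append((i, j))
            ri, rj = find(parent, i), find(parent, j)
            if ri != rj:
                parent[ri] = rj
                n_groups -= 1
    connected = {i: set() for i in range(n)}
    for i, j in edges:
        connected[i].add(j)
        connected[j].add(i)
    return edges, connected


# Four elements on the corners of a square: sides 1.0, diagonals 1.41.
square = [
    [0.0, 1.0, 1.41, 1.0],
    [1.0, 0.0, 1.0, 1.41],
    [1.41, 1.0, 0.0, 1.0],
    [1.0, 1.41, 1.0, 0.0],
]
edges, connected = group_elements(square)
print(edges)
print(connected)
```

In the square example all four sides are tied at the smallest distance, so the parallel version connects all of them and closes a loop, whereas the serial algorithm would stop after three connections and its result would depend on which side it skipped.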

*i*), we compute an interference value (*I*_{i}) based on its orientation relations with the flankers directly connected to it (*j*, belonging to the set Ce, defined by the grouping process and not including the target itself), using Equation A3:

*c* is a constant representing the baseline interference of a flanker. As a first approximation, we assume that regardless of its orientation each flanker produces the same baseline effect. This is in line with the observation that the location and local orientation anisotropy of flankers observed in Experiment 1 might not extend to the more complex configurations used in that experiment.

*G*(*θ*_{i}, *θ*_{j}) is the contribution to crowding of a connected pair of flankers, based on their orientation relations. These relations can be divided into five types: collinear, parallel, co-circular, T-junction, and all other relations. We assume that each of the five relation types contributes a different degree of interference. For each configuration, we sum all the *I* values (influence of the flankers in the crowding zone). We treat this sum as the stimulus crowding factor (Equation A4).

*G* (Equation A3). This is done by optimizing the values of the constant (*c*) and the function *G* (Equation A3) to produce crowding predictions for the relative interference of these configurations, which are in agreement with the measured results (*r*^{2} = [0.75, 0.76] for Experiments 2 and 3, respectively, the explained variance based on Pearson's correlation); see Figure A3. The values obtained by this method suggest that the strongest interference results from parallel pairs, followed by collinear, T-junction, and contrasting; with co-circular relations causing the least amount of interference (*c* = 0.1, *G*(*θ*_{i}, *θ*_{j}) = [1.2, 5.1, −0.3, 0.7, 0.4], for relation types: collinear, parallel, co-circular, T-junction, and other, respectively).
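The explained variance quoted above is the square of Pearson's correlation between predicted and measured values. The sketch below shows that computation only; the `predicted` and `measured` lists are hypothetical placeholder numbers, not data from the experiments, and the function names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def explained_variance(predicted, measured):
    """r squared: the share of variance in the measurements that a
    linear relation with the predictions accounts for."""
    return pearson_r(predicted, measured) ** 2


# HYPOTHETICAL placeholder values, for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0]
measured = [1.1, 1.9, 3.2, 3.8]
r2 = explained_variance(predicted, measured)
print(f"r^2 = {r2:.3f}")
```

Because the measure is based on correlation, it is insensitive to a linear rescaling of the predictions, which suits a model that predicts relative rather than absolute interference.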

*z*-scores across a and b pairs (for each display independently). Then for each of the target types (lines, circles, or ellipses), an average *z*-score was computed for each pair of a and b values. This gave us three vectors representing the relative efficiency of each pair for each target type. These vectors were sorted by efficiency (from high to low), and their first point of intersection (representing the most efficient pair of values overall) was chosen for our crowding simulation (described in 1). In addition, we ran the simulation with some of the other (efficient) pairs and got results comparable with those reported in 1, although some differences in the grouping pattern were observed. This might allow for individual customization of the model for different subjects.
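The selection procedure can be sketched as below, under two assumptions that the text does not confirm: that a higher raw score means higher efficiency, and that the "first point of intersection" is the first parameter pair to appear in all three sorted vectors as one reads down from the top. All function names and the example scores are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one display's scores across all (a, b) pairs."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]


def mean_z_per_pair(displays):
    """displays: list of per-display score lists, one score per pair.
    Returns the average z-score of each pair across displays."""
    z = [zscores(d) for d in displays]
    return [mean(column) for column in zip(*z)]


def first_intersection(rankings):
    """rankings: lists of pair indices sorted from high to low efficiency.
    Returns the first pair shared by the top of every ranking."""
    for depth in range(1, len(rankings[0]) + 1):
        common = set(rankings[0][:depth])
        for r in rankings[1:]:
            common &= set(r[:depth])
        if common:
            # Break ties by the smallest summed rank across target types.
            return min(common, key=lambda p: sum(r.index(p) for r in rankings))
    return None


def choose_pair(scores_by_type):
    """scores_by_type: dict mapping target type to its list of displays."""
    rankings = []
    for displays in scores_by_type.values():
        eff = mean_z_per_pair(displays)
        # ASSUMPTION: a higher average z-score means a more efficient pair.
        rankings.append(sorted(range(len(eff)), key=lambda p: -eff[p]))
    return first_intersection(rankings)


# HYPOTHETICAL scores for three candidate (a, b) pairs, indexed 0..2.
scores = {
    "lines": [[1, 3, 2], [2, 4, 3]],
    "circles": [[3, 2, 1]],
    "ellipses": [[1, 2, 3]],
}
best = choose_pair(scores)
print(best)
```

In this toy example no pair is ranked first for every target type, so the chosen pair is the one that reaches the top two of all three rankings.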

*D*_{s} values than those grouped in the previous iteration. And as a result of our chosen a and b values, nearly collinear or co-circular flankers will group at earlier iteration stages than other pairs. Therefore, at the end of each iteration, we can check whether any group might be our target configuration. The signal on which a decision would be based could be group saliency. Grouped elements might facilitate or synchronize each other, making their combined signal distinguishable from that of other elements in the stimuli. Since each of the separate groups created in the intermediate stages is defined in the model as independent from the rest (until grouped in a later iteration), localization information may be obtained as well for any group passing the detection threshold (based on retinotopic information). Figure B2 demonstrates several stages of grouping on a large Gabor array. A co-circular contour composed of twelve Gabor patches is grouped together after 28 iterations. To group together all the elements, the model completed 135 iterations.

There are a few aspects that we have not included which should be considered in future work:

1) Control of the grouping pattern: It might be desirable to bias grouping toward a predefined pattern. Some serial models of CI suggest that later iterations should try to preserve the relation type of earlier connections (e.g., depending on previous connections, the next connection is preferred to be either the snake or ladder type, May & Hess, 2007). Similarly, in certain cases the algorithm might be biased toward circles rather than lines (if the target is known in advance or its type is primed). Furthermore, for higher-level stimuli, such as letters, this biasing might be even more complex and driven by stored templates (see below).
2) Our model does not describe enhancement of grouped elements, which might be a desired feature for certain tasks, as well as for group saliency estimation (see above).
3) Our model does not enhance detection based on closure, as some researchers suggest human subjects do (Kovács & Julesz, 1993; Mathes & Fahle, 2007).

Although these properties were not included in the present version of the grouping algorithm, it is not difficult to see how they might be added without affecting our model's crowding estimation.