Article  |   September 2015
Space and time in masking and crowding
Author Affiliations
  • Maria Lev
    Goldschleger Eye Research Institute, the Sackler Faculty of Medicine, Tel-Aviv University, Tel-Hashomer, Israel
  • Uri Polat
    Goldschleger Eye Research Institute, the Sackler Faculty of Medicine, Tel-Aviv University, Tel-Hashomer, Israel
Journal of Vision September 2015, Vol.15, 10. doi:https://doi.org/10.1167/15.13.10
      Maria Lev, Uri Polat; Space and time in masking and crowding. Journal of Vision 2015;15(13):10. https://doi.org/10.1167/15.13.10.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Masking and crowding are major phenomena associated with contextual modulations, but the relationship between them remains unclear. We have recently shown that crowding is apparent in the fovea when the time available for processing is limited, pointing to the strong relationship between crowding in the spatial and temporal domains. Models of crowding emphasize the size (acuity) of the target and the spacing between the target and flankers as the main determinants that predict crowding. Our model, which is based on lateral interactions, posits that masking and crowding are related in the spatial and temporal domains at the fovea and periphery and that both can be explained by the increasing size of the human perceptive field (PF) with increasing eccentricity. We explored the relations between masking and crowding using letter identification and contrast detection by correlating the crowding effect with the estimated size of the PF and with masking under different spatiotemporal conditions. We found that there is a large variability in PF size and crowding effects across observers. Nevertheless, masking and crowding were both correlated with the estimated size of the PF in the fovea and periphery under a specific range of spatiotemporal parameters. Our results suggest that under certain conditions, crowding and masking share common neural mechanisms that underlie the spatiotemporal properties of these phenomena in both the fovea and periphery. These results could explain the transfer of training gains from spatiotemporal Gabor masking to letter acuity, reading, and reduced crowding.

Introduction
Spatial domain
Contextual modulation is a general phenomenon that refers to changes in the appearance of patterns or objects when they are presented within the context of other patterns. Some well-known types of contextual modulations are visual masking (including center surround), crowding, grouping, and several types of size illusion. However, most of the current research interest focuses on masking and crowding. Usually when the target stimulus is presented alone, it is easy for the observer to perform the task, but presenting a mask within a small spatiotemporal window with the target can make the observer's task very difficult (Breitmeyer, 1984; Breitmeyer & Ogmen, 2000; Enns & Di Lollo, 2000; Francis, 2000; Herzog & Fahle, 2002; Levi, 2008; Pelli, Palomares, & Majaj, 2004; Polat, 1999; Polat & Sagi, 1993; Whitney & Levi, 2011; Woods, Nugent, & Peli, 2002). 
Visual masking is most common when the target is surrounded by a mask (Cannon & Fullenkamp, 1991; Yu, Klein, & Levi, 2003; Yu & Levi, 2000) or is embedded in texture (Chubb, Sperling, & Solomon, 1989). The literature on masking distinguishes between pattern masking, when the mask and target are presented at the same retinal location (i.e., ordinary masking), and lateral masking, when the mask location does not overlap with the target's location (Levi & Carney, 2011; Pelli et al., 2004; Polat & Sagi, 1993). 
Visual crowding can be defined as the inability to recognize objects in clutter; it sets a fundamental limit on conscious visual perception and object recognition throughout most of the visual field (Levi, 2008; Pelli et al., 2004; Whitney & Levi, 2011). Crowding is most pronounced in peripheral vision, in the fovea of people with strabismus amblyopia (Bonneh, Sagi, & Polat, 2007; Flom, Weymouth, & Kahneman, 1963; Levi, 2008; Whitney & Levi, 2011), or in the fovea for short presentation times (Lev, Yehezkel, & Polat, 2014). The crowding effect is usually measured when the target and flankers are visually nonoverlapping; thus, it parallels the lateral masking measurements. 
Both crowding and masking are affected by similar factors such as the distance between the flankers and the target, their relative similarities, and the global arrangement (Levi, 2008; Livne & Sagi, 2011; Pelli & Tillman, 2008; Polat & Sagi, 1993; Whitney & Levi, 2011) as well as grouping (Kovacs, 1996; Manassi, Sayim, & Herzog, 2012, 2013; Sterkin & Polat, 2008) and attention (Freeman, Sagi, & Driver, 2001; He, Cavanagh, & Intriligator, 1996; Yeshurun & Rashal, 2010). Thus, since both crowding and masking share similar properties, studies suggest that masking and crowding are related, especially in the fovea (Chung, Levi, & Legge, 2001; Lev, Yehezkel, et al., 2014; Levi & Carney, 2011; Polat & Sagi, 1993; Polat, Sterkin, & Yehezkel, 2007). 
However, the general view, supported by many studies, considers crowding to be unlike ordinary masking, especially at the periphery (Chakravarthi & Cavanagh, 2009; Danilova & Bondarko, 2007; Levi, 2008; Levi & Carney, 2011; Pelli et al., 2004; Pelli & Tillman, 2008; Strasburger & Malania, 2013; Whitney & Levi, 2011). Most theories of crowding and masking suggest the existence of multistage processing (Chakravarthi & Cavanagh, 2009; Chung et al., 2001; He et al., 1996; Levi, 2008; Levi & Carney, 2011; Levi, Hariharan, & Klein, 2002; Livne & Sagi, 2011; Neri & Heeger, 2002; Parkes, Lund, Angelucci, Solomon, & Morgan, 2001; Pelli et al., 2004; Pelli & Tillman, 2008), whereby in the first stage the features are detected independently and integrated for object recognition at later stages. However, there is still no consensus regarding the relationship between masking and crowding. One characteristic that seems to be acceptable to most people in the field concerns the task: Masking relates to feature detection and crowding relates to feature identification. The main aim of our study was to explore the relationship between masking and crowding in the fovea and periphery under different spatiotemporal conditions and to clarify the distinctions commonly made between the two. 
Temporal domain
Two stages of processing are typically considered in models of spatial vision: The first is detection, and the second is identification. Most models consider masking as the first stage and crowding as the second “identification” stage. It was found (Neri & Heeger, 2002) that the identification stage takes place 100 ms after the detection stage. This result is consistent with our recent finding (Lev, Yehezkel, et al., 2014) that extra processing time is needed to overcome the crowding effect in the fovea. It was also shown that crowding is affected by temporal factors (Chung & Patel, 2011; Lev, Yehezkel, et al., 2014). Moreover, recent studies showed that, even at the periphery, the critical distance for crowding is not fixed and depends on the presentation time (Chung & Mansfield, 2009; Tripathy, Cavanagh, & Bedell, 2014). Taken together, there is emerging evidence that spatial crowding behaves differently under changing temporal conditions when the processing time is short. In this study we also aimed to explore the temporal domain that affects masking and crowding. 
Perceptive field
The perceptive field (PF), which is the psychophysical analog to the classical receptive field in the visual cortex (Jung & Spillmann, 1970), proposed as the processing unit of human visual perception, is of great interest for theoretical and practical aspects of vision. It has been suggested that the weighting functions of spatial filters that most resemble those of cortical simple cells have two to three cycles (Polat & Tyler, 1999; Watson, 1982; Watson, Barlow, & Robson, 1983) and that simple cells are matched to Gabor signals (Marcelja, 1980). A recent study, which used a reverse correlation technique, found that the PF is similar to that found in the visual cortex using a similar method (Neri & Levi, 2006). Interestingly, psychophysical studies of lateral masking that used Gabor patches provided similar estimations of about two to three cycles (λ) for the PF's size (Polat & Sagi, 1993; Zenger & Sagi, 1996). In this study, we use the term PF as a perceptual synonym for the physiological receptive field. 
The distinction between pattern and lateral masking is based on an implicit assumption that the boundaries between the inside (pattern) and outside (lateral parts) of the PF can be inferred from a visually apparent gap between the target and the mask. However, a direct estimate of the PF and how it affects masking and crowding is not yet known, especially at the periphery. We recently showed that the size of the PF increases with increasing eccentricity and that the effect of masking is related to this increase in the PF (Lev & Polat, 2011). Thus, additional aims of this study were to explore the spatial and temporal relationships between masking and crowding at the fovea and periphery and to determine whether they are related to the size of the PF. 
Working model, hypotheses, and aims
Our current in-progress descriptive working model is based on physiological and psychophysical data accumulated during the past few decades suggesting that lateral interactions may be involved in contextual modulations (Fitzpatrick, 2000; Kapadia, Westheimer, & Gilbert, 2000; Polat, 1999; Polat, Mizobe, Pettet, Kasamatsu, & Norcia, 1998; Polat & Norcia, 1998; Polat & Sagi, 1994a; Stettler, Das, Bennett, & Gilbert, 2002). The data show that lateral interactions modulate target visibility and, hence, masking. Results and models suggest that the modulations are derived by excitation (E) and inhibition (I) and that the E/I is determined by spatiotemporal parameters. However, whether crowding is affected by mechanisms similar to masking remains an unresolved question. We hypothesized that masking and crowding may share common mechanisms under a certain range of spatiotemporal parameters. Furthermore, we predicted that these relationships depend critically on the PF's size, which is dynamically changed by the spatiotemporal parameters that affect the E/I level. Hence, masking and crowding in the fovea and periphery may reveal stronger relationships than were previously assumed and reported. 
Support for our hypothesis that crowding and masking are related arises from results showing that training in contrast detection under masking conditions results in improvement in letter identification in young people with normal vision (Lev, Ludwig, et al., 2014), in presbyopes (Polat et al., 2012; Yehezkel, Sterkin, Lev, & Polat, 2015), in amblyopes (Polat, 2009b; Polat, Ma-Naim, Belkin, & Sagi, 2004; Polat, Ma-Naim, & Spierer, 2009), and in people with underdeveloped visual functions (Lev et al., 2015). These improvements were manifested in improved visual (letter) acuity, reduced crowding, and improved reading speed. Below are the hypotheses and assumptions behind the current study; the details are presented in the Appendix.
  1. Crowding and masking are affected by the PF's size. (a) Crowding and masking are suppressive effects from inside the PF. Outside the PF, the suppressive effect is reduced by lateral facilitation; thus, crowding is absent or minimal from outside the PF. (b) The PF's size is dynamic, affected by contrast, target–flanker separation, global configuration, and presentation time. (c) The PF is larger for low-contrast targets, but it is decreased for high-contrast targets and for I. (d) The PF size is modulated by a neural network of E and I. (e) The target's contrast threshold under collinear facilitation (usually 3λ at the fovea and 5λ at the periphery), which reduces the inhibitory effect, reveals the optimal PF size under lateral interactions. Since the PF's size for isolated targets (no lateral I) and for targets under conditions of lateral interactions may be different, the PF size, when measured under conditions of lateral interactions and near a target's contrast threshold, is most suitable for exploring the relationships between masking and crowding. Thus, at the border of the PF, masking and crowding effects exhibit the strongest correlation.
  2. Crowding and masking are processed by common neural interactions under certain spatiotemporal conditions in both the fovea and the periphery. (a) The neural interactions of E and I modulate target visibility. The E/I level is determined by spatiotemporal parameters such as contrast, target–flanker separation, global configuration, and presentation time. (b) I is pronounced for short target–flanker separations, high-contrast targets, and short processing times. E is pronounced for larger target–flanker separations, low-contrast targets, collinear configurations, and longer processing times. (c) Crowding is affected by I (suppression), whereas masking is affected by I and E (suppression and facilitation). Crowding is pronounced when the E/I level is shifted toward I. Crowding is released when the E/I level is shifted toward E or for minimal I, when the lateral E (lateral masking) takes place. (d) There is a critical distance (the border of the PF) and a critical time (dynamics of E and I) at which E/I reaches a certain level, which may indicate the PF's size. Isolating the optimal spatiotemporal conditions when E/I reaches this level reveals the optimal correlation between masking and crowding.
General method
Subjects
A total of 33 subjects participated in this study. The number of subjects that participated in each experiment is indicated in the relevant experiment. Their ages ranged from 17 to 40 years, and they had normal or corrected-to-normal visual acuity. The participants signed an informed consent form that was approved by the local Institutional Review Board of Sheba Medical Center. 
Apparatus
Stimuli were displayed as gray-level modulation on a Philips 107P color monitor (Philips, Eindhoven, The Netherlands). A Dell personal computer (Dell, Round Rock, TX) controlled the experiments. The stimuli were viewed from a distance of 150 cm in a dark room. The mean display luminance was 40 cd/m² in an otherwise dark environment. Gamma correction was applied. When we used Gabor targets (Figure 1a, b), the stimuli were localized gray-level gratings (Gabor patches) with equal luminance distribution (standard deviation σ = λ [wavelength], allowing a minimum of two cycles in the Gabor patches), modulated from a background luminance (40 cd/m²). Thus, at target–flanker separations of 3λ or more, there is no overlapping between the target and the flankers (Polat, 1999, 2009a; Polat & Sagi, 1994b). Two identical flankers with high contrast served as a mask. In the crowding experiments, we used E letters embedded in a matrix of E letters (Figure 1c through e), with the surround mask identical to the target. The spatial separation between the target and the flanker (physical gap, interletter spacing) is measured in units of letter size (degrees). We also conducted a temporal crowding experiment using a temporal gap (interstimulus interval, ISI) between the E target and the matrix of E letters (Figure 1e). The type of stimuli that we used in each experiment is indicated in the relevant Method section of that experiment. Screen resolution was 1024 × 768 pixels; for a viewing distance of 150 cm, it occupied a 12.2° × 9.2° area. Thus, 1° includes approximately 84 pixels (1024/12.2). Therefore, 1 arcmin includes 1.4 pixels (84/60), which corresponds to the standard visual acuity of 6/6 (20/20). When converting to logMAR, 1 arcmin equals 0 logMAR (log₁₀ 1 = 0). Thus, we can present the results as cycles per degree (cpd), visual angles (degrees, arcmin), or acuity units (logMAR). 
Note that the size of the E letter is always five times larger than the acuity limit, which is determined by the resolution of the gap between the E strokes. More details are given in the relevant experiments. 
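The display arithmetic above can be reproduced with a short sketch. The function names are illustrative assumptions, not part of the original apparatus software; the numbers (1024 pixels across 12.2°, a letter five times the gap size) come from the text.

```python
import math

def pixels_per_degree(pixels_across: float, degrees_across: float) -> float:
    """Pixels subtended by 1 degree of visual angle along one screen axis."""
    return pixels_across / degrees_across

def logmar(mar_arcmin: float) -> float:
    """logMAR is the base-10 log of the minimum angle of resolution in arcmin."""
    return math.log10(mar_arcmin)

def letter_size_arcmin(gap_arcmin: float) -> float:
    """The E letter is five times the size of the gap between its strokes."""
    return 5.0 * gap_arcmin

ppd = pixels_per_degree(1024, 12.2)  # horizontal axis of the display
ppa = ppd / 60.0                     # pixels per arcmin

print(round(ppd), round(ppa, 1), logmar(1.0), letter_size_arcmin(1.0))
```

With these inputs the sketch gives about 84 pixels per degree and about 1.4 pixels per arcmin, matching the values stated above; a 1-arcmin gap corresponds to 0 logMAR and a 5-arcmin letter.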
Figure 1
 
Example of stimuli used in our experiments. (a) Collinear and orthogonal. (b) Lateral masking with different target–mask separations. The lateral masking consisted of a target in the presence of two collinear flankers. At the top, each separation is indicated by λ (wavelength) units. (c) Single and crowded letters used to measure the crowding effect at the fovea. (d) Crowded letters with larger interletter spacing used to measure crowding in the periphery. (e) Temporal crowding with letters used in the periphery.
Procedures
To measure the effect of Gabor target detection, we used temporal two-alternative forced choice (2AFC) at the fovea and the yes–no procedures at the periphery. The suitability of using temporal 2AFC at the periphery was debated in earlier studies that measured lateral masking at the periphery due to possible eye movements, especially if the stimuli appeared at the same location in the second interval (Giorgi, Soong, Woods, & Peli, 2004; Lev & Polat, 2011; Shani & Sagi, 2005). For that reason, we used here the procedure used in Lev and Polat (2011). The task was either to detect a Gabor target in the masking experiments or to identify the direction of the letter E in the crowding experiments. Subjects were informed of a wrong answer by auditory feedback after each presentation throughout the experiment. A visible fixation circle appeared in the center before each trial and disappeared when the participants pressed the “ready” button, after which a blank screen appeared for a random ISI of between 300 and 800 ms; thereafter, the trial began. Each subject underwent a practice session before starting the experiment. The presentation time (duration) in each experiment may differ; therefore, it is given in the relevant location in each experiment. 
Data analysis
We used a paired two-tailed t test to compare two conditions within the same subjects; the significance is reported as t and p values. When the p value was smaller than four zero digits, we reported it as p < 0.0005. We used a linear trend line fit to determine the correlations between conditions, and we reported the F value; the strength of the fit is indicated as r = √(r²), and the p value indicates the significance of the fit. 
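As a minimal sketch of the two statistics named above, the following computes a paired t statistic and the r = √(r²) of a least-squares line using only the standard library. The function names and the sample values are hypothetical; the p values would additionally require the t or F distribution, which is not included here.

```python
import math
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two within-subject conditions."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def linear_fit_r(xs, ys):
    """r = sqrt(r^2) of the least-squares line through (xs, ys); the sign is dropped."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r_squared = sxy ** 2 / (sxx * syy)
    return math.sqrt(r_squared)

# Hypothetical per-subject values, for illustration only.
t_value, df = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
r_value = linear_fit_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(round(t_value, 3), df, round(r_value, 3))
```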
Effects at the periphery
Experiment 1: Estimating the size of the PF
Introduction
A critical issue in our working model is estimating the PF's size and relating it to crowding. Many studies in humans could not define the underlying size of the PF, especially when using broadband stimuli such as letters or lines (Levi & Carney, 2011; Polat & Sagi, 1993). Thus, when considering lateral interactions, one should probe mainly within the PF integration, owing to the possibility of activating not only the optimal PF but also relatively large (lower frequency) PFs that cover both the target and the mask. 
Theoretical and experimental studies suggest that the optimal shape of stimuli that fit the receptive field of simple cells in the primary visual cortex are Gabor functions, with the standard deviation (σ) equal to wavelength (λ). The size of this optimal shape is 2λ to 3λ at the fovea (Marcelja, 1980; Neri & Levi, 2006; Polat & Sagi, 1993; Polat & Tyler, 1999; Watson, 1982; Watson et al., 1983) and about 5λ at the periphery at an eccentricity of 4° (Lev & Polat, 2011; see Appendix). Thus, masking (ordinary) and crowding effects may result from combining responses from the target and the mask within the same PF (integration within the PF, pattern masking) even when a gap can be seen, when using letters or in the periphery (Lev & Polat, 2011). Therefore, we propose that pattern and lateral masking may be inseparable unless the size of the underlying PF is clearly known. In this experiment we estimated the size of the PF, using the lateral masking paradigm (Lev & Polat, 2011; see Appendix), and explored the relationships among masking, crowding, and the size of the PF. 
Method
We used the lateral masking with a yes–no paradigm using Gabor patches (Figure 1a, b); this procedure was shown to be effective for measuring the effects of lateral masking at the fovea and periphery (Amiaz, Zomet, & Polat, 2011; Lev & Polat, 2011; Polat & Sagi, 2007; Zomet, Amiaz, Grunhaus, & Polat, 2008). Data collected using temporal 2AFC at the periphery may be challenged due to eye movements (Giorgi et al., 2004; Lev & Polat, 2011; Shani & Sagi, 2005). However, the yes–no procedure that we recently used overcame this potential problem and provided reliable data for measuring lateral interactions. 
Subjects were asked to detect a low-contrast target (14.42 ± 3.6, M ± SD) embedded between two collinear or orthogonal flankers. The contrast of the peripheral target was scaled to be 2.25 times larger than the foveal target's contrast, and the contrast of the peripheral flankers was 1.5 times larger than the foveal mask's contrast. The scaling was determined in pilot experiments and was consistent with previous studies (Foley, Varadharajan, Koh, & Farias, 2007; Lev & Polat, 2011). The target and flankers appeared randomly at the center or 4° to the left or right of the center. The spatial frequency of the target and the flankers was six cycles per degree, and the stimuli were presented for 60 ms. The target–flanker distances were 3λ, 5λ, and 7λ. Subjects reported whether the target was present (yes) or absent (no) by pressing the left or right mouse button, respectively. The aim of the central task was to encourage foveal fixation and minimize attempts of eye movements to the periphery. It used target–flanker distances that were similar to those at the periphery and therefore were nonoptimal for the fovea, thus providing redundant data. The measurements of the false alarm (FA), miss, and hit as well as the correct rejection rates were the same as in our previous studies (Amiaz et al., 2011; Lev & Polat, 2011; Polat & Sagi, 2007; Zomet et al., 2008). Each condition was presented 50 times at each location (fovea, right, and left), with the target present in about one half of the trials (a probability of 0.5). We estimated the PF for each subject using the hit rate and d′. Twenty subjects participated in this experiment. 
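The yes–no measures used here can be illustrated with a short sketch. It assumes the standard signal detection definition d′ = z(hit rate) − z(FA rate), which the text does not spell out, and uses hypothetical trial counts; rates of exactly 0 or 1 would need a correction that is not shown.

```python
from statistics import NormalDist

def yes_no_rates(hits, misses, false_alarms, correct_rejections):
    """Hit rate and false alarm rate from the four yes-no outcome counts."""
    p_hit = hits / (hits + misses)
    p_fa = false_alarms / (false_alarms + correct_rejections)
    return p_hit, p_fa

def d_prime(p_hit, p_fa):
    """Sensitivity under the standard equal-variance model (an assumption here).

    inv_cdf raises an error for rates of exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_fa)

# Hypothetical counts: 25 target-present and 25 target-absent trials.
p_hit, p_fa = yes_no_rates(hits=20, misses=5, false_alarms=5, correct_rejections=20)
print(p_hit, p_fa, round(d_prime(p_hit, p_fa), 2))
```

Because d′ combines the hit and FA rates, it can change across conditions even when the hit rate alone does not, which is why both measures are reported.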
Estimating the PF
At the periphery, measuring the response to the target alone may reveal higher thresholds owing to increased spatial uncertainty, and it may be less reliable than measuring it at the fovea (Lev & Polat, 2011; Levi & Carney, 2011). In addition, measuring the PF's size under the target alone or under lateral interactions may reveal different sizes (see Appendix). Therefore, since the orthogonal configuration displays no modulation effect outside the PF (Lev & Polat, 2011; Maniglia et al., 2011; Maniglia, Pavan, & Trotter, 2015; Shani & Sagi, 2005), some studies used the orthogonal configuration as a reference when measuring the masking effect at the periphery (see Appendix). As described above, in our previous study (Lev & Polat, 2011) and in the Appendix, the collinear/orthogonal ratio of 1 can be used to estimate the border of the PF. Thus, here we used the collinear/orthogonal ratio of 1 to estimate the border between the suppression and facilitation zones and the size of the suppressive zone. 
Results
Figure 2 shows the effects of target–flanker separation and flanker orientation on contrast detection of the target under lateral masking. The average results for collinear and orthogonal configurations for sensitivity (d′) are presented in Figure 2a and c and for phit (percent of hit responses) in Figure 2b and d. A two-way repeated measures analysis of variance (ANOVA; orientation [collinear vs. orthogonal] × separation), performed separately for phit and d′, revealed a significant interaction; phit: F(2, 38) = 33.54, p < 0.0005; d′: F(2, 38) = 27.57, p < 0.0005. At target–flanker separations of 7λ, the d′ values for the collinear and orthogonal configurations are not significantly different (d′: collinear, 1.95; orthogonal, 2.03; p = 0.72), but phit is significantly higher for collinear configurations (phit: collinear, 0.83; orthogonal, 0.66; p = 0.0016). However, at target–flanker separations of 3λ, d′ for the orthogonal configuration is significantly larger than for the collinear configuration (d′: collinear, 0.48; orthogonal, 2.39; p < 0.0005). The hit rate for a target–flanker distance of 3λ is high for the orthogonal and low for the collinear configuration (phit: collinear, 0.46; orthogonal, 0.81; p < 0.0005). 
Figure 2
 
Estimate of the PF. (a) Average d′ (y axis) against target–flanker separation (λ units, x axis) of the Gabor patches. The red line and closed triangles denote the orthogonal configuration, and the blue line with filled diamonds denotes the collinear configuration (see Figure 1a). (b) The same as for panel a but for phit. (c) The d′ ratio (collinear/orthogonal) y axis as a function of target–flanker separation (λ units, x axis). Each filled blue circle denotes the ratio for one subject. The solid line denotes the average of the data points. (d) The same as for panel c but using phit. The error bars denote the standard error of the mean (n = 20).
The collinear configuration shows clear and significant trends of decreasing d′ and phit values with decreasing target–flanker distances (d′: 3λ = 0.48 vs. 7λ = 1.95, p < 0.0005; phit: 3λ = 0.46 vs. 7λ = 0.83, p < 0.0005). Likewise, d′ and phit values were significantly smaller for 5λ than for 7λ but were larger than 3λ (phit = 0.7, p < 0.00005 for both separations; d′ = 1.56, p < 0.0018 and p < 0.00005, respectively). Thus, this effect of reduced d′ and phit values for shorter target–flanker separations for the collinear configuration is indicative of a suppressive effect (masking) even though there is no overlapping between the target and the flankers at 3λ (Lev & Polat, 2011; Polat & Sagi, 2007). 
In contrast, for the orthogonal configuration, d′ does not display significant changes with a changing target–flanker distance (orthogonal, d′: 3λ = 2.39, 5λ = 1.8, 7λ = 2.03; p = 0.15 3λ vs. 7λ), whereas phit does tend to change with changing target–flanker distances, with a significant effect only between 3λ and 7λ (orthogonal, phit: 3λ = 0.81 vs. 7λ = 0.66; p = 0.003). Note that the relationships between changing target–flanker distances and phit are opposite for collinear and orthogonal configurations. This effect of a preferred response of orthogonal over the collinear configuration inside the PF is consistent with previous experimental data (Knierim & van Essen, 1992; Lev & Polat, 2011; Levitt & Lund, 1997; Polat et al., 1998). 
We estimated the PF size for each subject by calculating the ratio (collinear/orthogonal) for each target–flanker separation and fitted a linear line for the three target–flanker separations (Lev & Polat, 2011) for d′ (Figure 2c) and phit (Figure 2d). The ratio = 1 (collinear = orthogonal) was taken as the crossing border between the suppression and facilitation zones. The results (Figure 2c, d) show the individual calculations of the ratio for each subject (filled circles) and the average result (solid line) at each distance. Based on previous data, our pilot experiments, and the model's prediction regarding the spatial extent of the I and E, we set lower and upper boundaries for the PF's size: 2.5λ for the smallest size and 8λ for the largest size. Therefore, when the estimated border was higher than the boundary (phit: n = 1; d′: n = 4), it was set to 8λ. Likewise, when the estimate was lower than the boundary (phit: n = 1; d′: n = 1), it was set to 2.5λ. The average ratio was lowest for 3λ and highest at 7λ (d′: 3λ = 0.27 vs. 7λ = 0.96, p < 0.00005; phit: 3λ = 0.59 vs. 7λ = 1.42, p < 0.00005). The estimated ratio = 1 for the group indicates that the size of PF is between 5λ and 6λ. The ratio was smaller than 1 at 3λ for the majority of the subjects, indicating suppression, whereas facilitation was found for the majority of subjects for a larger target–flanker distance (7λ), consistent with our previous results (Lev & Polat, 2011). The estimated ratio (collinear/orthogonal = 1) was variable, leading to a variable estimation of the PF's size among subjects (d′: range = 2.5λ–8λ, 5.9λ ± 1.68λ, 1° ± 0.27°, M ± SD; phit: 2.5λ–8λ, 5.5λ ± 1.57λ, 0.91° ± 0.26°), consistent with the results of Lev and Polat (2011). 
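The estimation procedure described above (fit a line to the collinear/orthogonal ratio at the three separations, find where it crosses 1, and clip to the 2.5λ–8λ range) can be sketched as follows. The function names and the example ratios are hypothetical, and raising an error for a non-increasing fit is an assumption of this sketch, since the text does not say how such a case was handled.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def pf_border(separations, ratios, lower=2.5, upper=8.0):
    """Separation (in wavelengths) where the fitted ratio equals 1, clipped to bounds."""
    slope, intercept = fit_line(separations, ratios)
    if slope <= 0:
        # Assumption of this sketch: no suppression-to-facilitation crossing exists.
        raise ValueError("ratio does not increase with separation")
    crossing = (1.0 - intercept) / slope
    return min(max(crossing, lower), upper)

def wavelengths_to_degrees(n_wavelengths, cycles_per_degree=6.0):
    """At 6 cycles per degree, one wavelength spans 1/6 of a degree."""
    return n_wavelengths / cycles_per_degree

# Hypothetical ratios for one subject at 3, 5, and 7 wavelengths.
border = pf_border([3, 5, 7], [0.4, 0.9, 1.4])
print(round(border, 2), round(wavelengths_to_degrees(border), 2))
```

For these illustrative ratios the crossing falls at about 5.4λ, or about 0.9° at six cycles per degree; the same conversion maps the group mean of 5.9λ to roughly 1°.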
Discussion
We replicated our previous finding (Lev & Polat, 2011) that the average size of the PF at 4° is about 5λ—more than twice the estimated size of the foveal PF (2λ). Our results confirm our suggestion that previous studies used nonoptimal target–flanker separations at the periphery to explore the effect of crowding (Chakravarthi & Pelli, 2011; Levi & Carney, 2011), masking, and collinear facilitation (Levi, Klein, & Hariharan, 2002; Shani & Sagi, 2005; Williams & Hess, 1998; Zenger-Landolt & Koch, 2001). 
One important point deduced from our data is that the masking effect (suppression) can be demonstrated at the periphery for a nonoverlapping target–flanker separation of 3λ. The effect of suppression is manifested by reduced phit values. This result of suppression for nonoverlapping target–flanker separations was revealed despite the effect of a stable FA, whereas phit values changed with distance, indicating that the effect was not due to a bias in reporting the targets present but instead was probably due to the contribution of the flankers and the target signals inside the PF (Chen & Tyler, 2008; Meirovithz et al., 2010; Polat & Sagi, 2007). This demonstrates the pattern-masking effect but under experimental conditions that are usually considered as lateral masking. This result supports our model's prediction that pattern and lateral masking may be inseparable in some cases, especially at the periphery, without taking into account the size of the underlying PF. 
Another important finding in our study was the large variability of the PF's size found among subjects. This effect may suggest that the spatial size of crowding is not fixed but instead is variable among subjects and presentation times (Tripathy et al., 2014). For a discussion about the relationships between the critical range of crowding as well as the PF's size and eccentricity, see Levi (2008), Pelli and Tillman (2008), and Whitney and Levi (2011). 
Comparison between phit and d′ and the effect of FA
Figure 3 shows that the correlation between the estimate of the PF's size using d′ and phit is high and significant: F(1, 18) = 23.76, r = 0.75, p = 0.00012. Thus, our results are consistent with our previous study indicating that the use of phit as a measure is accurate, that phit can be used as a reliable tool for estimating the PFs (Lev & Polat, 2011), and that it is consistent with the sensitivity measure of collinear facilitation when comparing yes–no and 2AFC (Polat & Sagi, 2007). Therefore, using the hit rate ratio to estimate the PF's size seems to be more stable and reliable. 
Figure 3
 
Correlation between the estimates of the PF. The y axis denotes the estimates using phit, and the x axis denotes the estimates using d′. Each filled circle denotes one subject. The solid line is the linear fit for the data (n = 20).
Consistent with previous studies, our results show that when using the yes–no paradigms and the mix-by-trial paradigm (Lev & Polat, 2011; Polat & Sagi, 2007; Zomet et al., 2008), d′ does not follow the expected results of collinear facilitation as measured by temporal 2AFC (Polat & Sagi, 2007). The difference between the two methods can be explained as high noise at the decision-making level, resulting from the high FA rate (in the yes–no method) that increases with decreasing target–flanker separation, producing a filling-in percept (Meirovithz et al., 2010; Polat & Sagi, 2007). However, the difference between the measure of d′ using yes–no and 2AFC is still under theoretical and experimental consideration and is beyond the scope of this article. Next, we explore the relationship between the PF's size and crowding. 
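To make the role of the FA rate concrete, the following minimal sketch computes d′ under the standard signal-detection definition, d′ = z(phit) − z(pFA). It assumes that definition applies here; the function name and the example rates are hypothetical and are not taken from the data.

```python
from statistics import NormalDist


def d_prime(p_hit: float, p_fa: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumes the standard equal-variance signal-detection definition.
    Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(p_hit) - z(p_fa)


# Hypothetical rates (not measured values): the hit rate is held
# constant while the false-alarm rate rises.
low_fa = d_prime(0.80, 0.10)
high_fa = d_prime(0.80, 0.30)
print(round(low_fa, 2), round(high_fa, 2))
```

With the hit rate fixed, a higher FA rate lowers d′, which is one way an FA rate that grows at short target–flanker separations could pull the yes–no d′ away from the 2AFC pattern.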
Experiment 2: Size of PF and letter identification (crowding)
Introduction
Flom et al. (1963) measured the contour interactions (crowding effect) at the fovea of people with normal vision and those with amblyopia. They found an effect at the fovea for target–flanker separations of less than 5 arcmin for people with normal vision but a much greater effect at the fovea of people with amblyopia. They concluded that the contour interaction effect is related to the visual acuity (the minimum angle of resolution) and that it is therefore related to the size of the receptive field. Accordingly, the crowding effect and the PF should be scaled with an eccentricity having a similar proportionality (scale-shift model; Levi, 2008). However, many studies exploring crowding at the periphery found that the range of the crowding effect is larger than what can be expected from the size of a single letter and concluded that the scale-shift model is incomplete and thus cannot explain crowding at the periphery (Levi, 2008; Levi, Hariharan, et al., 2002; Pelli & Tillman, 2008; Whitney & Levi, 2011). This discrepancy appears to be due to the major finding that the crowding range is not scaled with an increasing letter size at the periphery. An alternate model suggests that the critical spacing needed for identifying small letters is roughly half the eccentricity (Bouma, 1970). This relationship, known as Bouma's law, is one of the main characteristics used to test the crowding effect (Levi, 2008; Pelli & Tillman, 2008; Whitney & Levi, 2011). However, a deviation from the scale-shift model may arise from the use of a nonoptimal distance between the target and flankers, thus activating PFs larger than those assumed at the periphery. Moreover, although it is well known that the size of the PF increases with eccentricity, there are no direct ways of measuring the size of the PF at the periphery and of assessing direct relationships among the letter size, the PF's size, and crowding. 
We predicted that the size of the PF should increase with increasing eccentricity and correlate with the crowding effect, in line with the scale-shift model. Variability in the PF size across observers in the same eccentricity implies that the critical distance for crowding is not a precise value. The group average, on the other hand, may follow Bouma's law and allow us to make comparisons between the scale-shift model and Bouma's law. Here we explore how the PF's size is related to crowding at the periphery. 
Method
The targets were Tumbling E patterns that appeared randomly at the center or 4° to the left or right of the center for a duration of 60 ms. A forced-choice paradigm was used in which the subjects were asked to detect whether the open side of a visible letter E (Figure 1c) was to the right or the left; subjects reported their answer by pressing the left or the right mouse key. We used a staircase of three up one down to determine the threshold (79% correct) of letter size (Bonneh et al., 2007). Spacing was always kept proportional to the letter's size (see Yehezkel et al., 2015). The thresholds for each location were measured using a separate staircase mechanism. There were about 50 trials per location, and the data from the right and left sides were combined, thus totaling 100 trials per data point. The foveal task served as a tool to force fixation at the fovea and to prevent eye movements to the periphery. The interletter spacings used (two and four letters) are suitable for exploring crowding at the periphery. However, these interletter spacings were too large to show crowding at the fovea (Lev, Yehezkel, et al., 2014; Levi, 2008; Pelli et al., 2004; Whitney & Levi, 2011). Thus, as expected, the foveal results show no crowding effect; therefore, they are not shown here. In Experiment 1 (Figure 2) and our previous study (Lev & Polat, 2011), we found that the average size of the PF is about 5λ to 6λ (approximately 0.8°–1.0°). Based on our pilot data and the estimated average size of the PF, we used two different interletter spacings: an interletter spacing of four letters, which was assumed to produce weak or no crowding, and a shorter interletter spacing of two letters, which produced strong crowding. The crowding effect was calculated as the difference between the thresholds (in logMAR) under the crowded condition and the single letter (crowded minus single). The same 20 subjects participated in this experiment. 
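The 79% convergence level quoted for the staircase can be reproduced with a short calculation. The sketch below assumes that the staircase follows the standard transformed up-down rule, in which three consecutive correct responses are required before the letter size is reduced and a single error increases it; the function name is hypothetical.

```python
def staircase_target(n_correct_to_step: int) -> float:
    """Proportion correct at which an n-correct/1-error staircase settles.

    At equilibrium the probability of n consecutive correct responses
    equals 0.5, so p ** n = 0.5 and p = 0.5 ** (1 / n).
    """
    return 0.5 ** (1.0 / n_correct_to_step)


# Three consecutive correct responses per step (assumed rule).
print(round(staircase_target(3), 3))
```

Under this assumption the staircase converges near 0.794, which matches the 79% correct level stated in the Method.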
Results
The threshold elevations of the minimal letter size (visual acuity) for crowded conditions are presented in Figure 4. The results show that the crowding effect (crowded minus noncrowded) is significantly larger for two than for four interletter spacings (letter size in log units; 0.46 vs. 0.22, p < 0.0005; single letter = 0.43; two interletter spacings = 0.89; four interletter spacings = 0.65), showing the known dependency of crowding regarding the distance between the target and flankers and regarding the letter size (acuity). 
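The crowding effects quoted above follow directly from the mean thresholds. The sketch below only restates that subtraction (crowded minus single) using the means given in the text; the variable and function names are hypothetical.

```python
# Mean letter-size thresholds (log units) quoted in the Results text.
thresholds = {"single": 0.43, "two_spacings": 0.89, "four_spacings": 0.65}


def crowding_effect(crowded: float, single: float) -> float:
    """Crowding effect = crowded threshold minus single-letter threshold."""
    return round(crowded - single, 2)


effects = {
    condition: crowding_effect(value, thresholds["single"])
    for condition, value in thresholds.items()
    if condition != "single"
}
print(effects)
```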
Figure 4
 
The crowding effect using letter identification. The crowding effect (crowded minus a single letter) in logMAR units (y axis) against two and four interletter spacings. The crowding effect is much larger for two interletter spacings. The error bars denote the standard error of the mean (n = 20).
Next, we correlated for each subject the size of the PF that was calculated in Experiment 1 (Figure 2) with the crowding effect (Figure 4) for four interletter spacings. The results (Figure 5) show a significant correlation between the size of the PF and the target size under crowded conditions, as measured using d′ (Figure 5a), F(1, 18) = 7.14, r = 0.54, p = 0.015, and the hit rate (Figure 5b), F(1, 18) = 33.45, r = 0.81, p < 0.00005. Subjects with larger PF values exhibited stronger crowding, and those with smaller PF values exhibited weaker crowding. Figure 5c and d show that there is a slightly weaker correlation with the crowding effect (crowded minus noncrowded) for both calculations; d′: F(1, 18) = 4.42, r = 0.46, p = 0.049; phit: F(1, 18) = 22.81, r = 0.76, p = 0.00015. Possibly, this was because some subjects reached a very large letter size that prevented us from reliably measuring the effect with the crowded letters (the ceiling effect). Possibly for the same reason, the correlation at two interletter spacings was not significant either for the crowded threshold, phit: F(1, 18) = 3.82, r = 0.41, p = 0.06, or for the crowding effect, phit: F(1, 18) = 0.83, r = 0.21, p = 0.37. Thus, a stronger correlation was found between the size of the PF and crowding for larger interletter separations, suggesting that the correlation is better near or at the border of the PF. Our model predicts that the correlation between the masking and crowding effects at the border of the PF is higher than either inside or outside the PF, since at the border the E/I reaches a level where the lateral E balances the local I. Indeed, we found a significant correlation between the masking and crowding effects at 5λ (the estimated border, Experiment 1); d′: F(1, 18) = 5.34, p = 0.034, r = 0.5; phit: F(1, 18) = 16.00, p = 0.0008, r = 0.68. However, inside the PF (3λ) the correlation is insignificant; d′: F(1, 18) = 3.62, p = 0.07, r = 0.4; phit: F(1, 18) = 3.31, p = 0.09, r = 0.38. 
At 7λ an insignificant effect was found for d′, but it was significant for phit; d′: F(1, 18) = 1.44, p = 0.24, r = 0.27; phit: F(1, 18) = 5.3, p = 0.033, r = 0.47. Thus, our results confirm our model's prediction that the crowding effect is related to the size of the PF. 
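The F and r values reported for these simple linear regressions are tied together by the identity F = r²·df/(1 − r²), with df = n − 2. The sketch below, with hypothetical function names, converts between the two. Because the published r values are rounded, the recomputed figures agree only approximately.

```python
def f_from_r(r: float, n: int) -> float:
    """F statistic (1, n - 2 df) of a simple linear regression from r."""
    df = n - 2
    return (r * r) * df / (1.0 - r * r)


def r_from_f(f: float, n: int) -> float:
    """Magnitude of the correlation coefficient implied by F(1, n - 2)."""
    df = n - 2
    return (f / (f + df)) ** 0.5


# n = 20 subjects, so df = 18, as in F(1, 18).
print(round(r_from_f(33.45, 20), 2))  # compare with the reported r = 0.81
print(round(f_from_r(0.54, 20), 2))   # compare with the reported F = 7.14
```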
Figure 5
 
The correlation between crowding and the PF. The crowding condition y axis (panels a and b; degrees, in letter spacing, center to center) and the crowding effect (panels c and d; crowded minus single letter) against the estimated size of the PF (x axis). The top axis is in degrees, and the bottom axis is in λ units. Each data point denotes the estimated size of the PF for each subject. The solid line is the linear regression line; the correlation is indicated by r in each panel. Panels a and c used d′, whereas panels b and d used the phit measurement (n = 20).
Discussion
The results of this experiment confirm our hypotheses and assumptions that the size of the PF and the crowding effect in letter identification are related. The results indicate a correlation between crowding, as measured by letter acuity, and the size of the PF for each subject (i.e., that the crowding effect is proportional to the size of the PF; Flom et al., 1963). Note that the range of crowding varies widely among subjects, by about a factor of three (1.22–3.45°, 1.95 ± 0.52, M ± SD, center to center), unlike the fixed window of crowding suggested earlier at the periphery (Levi, 2008; Pelli & Tillman, 2008; Whitney & Levi, 2011). Thus, our results are consistent with our hypotheses and assumptions showing that the best correlation was found at the border of the PF for specific spatiotemporal parameters. Note also that the average range of crowding is about 2° (center to center), which is consistent with the range expected from Bouma's law at 4° of eccentricity. 
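The comparison with Bouma's law can be written out explicitly. The sketch below assumes the proportionality factor of roughly one half stated in the Introduction to this experiment; the function name is hypothetical, and the observed values are those quoted in the paragraph above.

```python
def bouma_critical_spacing(eccentricity_deg: float, factor: float = 0.5) -> float:
    """Critical spacing (deg) predicted as a fixed fraction of eccentricity."""
    return factor * eccentricity_deg


predicted = bouma_critical_spacing(4.0)

# Observed center-to-center crowding range quoted in the text (deg).
observed_mean, observed_min, observed_max = 1.95, 1.22, 3.45

print(predicted)                                  # group-level prediction
print(round(abs(predicted - observed_mean), 2))   # distance from the group mean
print(round(observed_max / observed_min, 2))      # spread across subjects
```

The group mean lies close to the prediction, whereas the ratio of the largest to the smallest individual range is close to three, which is the variability emphasized above.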
Experiment 3: Temporal properties of letter identification (crowding) and the size of the PF
Introduction
Recently we showed that letter identification (crowding) in the fovea is affected by the time that the target is presented and that crowding increases when a matrix of similar flankers is presented after a short delay (ISI) of 30 or 60 ms, but that the crowding effect decreases with increasing delays (Lev, Yehezkel, et al., 2014). This is referred to here as temporal crowding of letters. This result is supported by a study that explored the relationships of temporal asynchrony of crowding (Chung & Patel, 2011). Importantly, it was shown that maximal crowding does not require the target and flankers to be presented simultaneously; conversely, simultaneous presentation of the target and flankers underestimates the crowding effect. Our model predicts that there is a spatiotemporal range at which masking and crowding are related. In Experiment 2 we explored the spatial range of the crowding in relation to the PF's size. Here our aim was to explore the temporal domain of crowding and the relationships with the PF's size. To this end, we used temporal parameters that exhibited a suppressive effect in temporal crowding or masking (Breitmeyer, 1984; Chung & Patel, 2011; Enns & Di Lollo, 2000; Lev, Yehezkel, et al., 2014; Polat & Sagi, 2006; Saarela & Herzog, 2008). 
Method
The crowding effect was measured using a letter identification task that was described before (Lev, Yehezkel, et al., 2014). Here we chose parameters that exhibit temporal effects in masking (Lev, Yehezkel, et al., 2014; Polat & Sagi, 2006) and crowding (Chung & Patel, 2011; Lev, Yehezkel, et al., 2014). We measured the threshold of identifying a single letter size (visual acuity) presented for 60 ms under two conditions: the target alone or the target under a temporal crowding condition using a matrix of E letters with an interletter spacing of two or four (the same that we used in Experiment 2). The size of the letters was measured using the same staircase as in Experiment 2, which converged to 79% correct. The crowding matrix of letters, which did not overlap with the target's location, appeared after the target alone (ISI = 60 ms, stimulus onset asynchrony [SOA] = 120 ms; see Figure 1e). The effect of crowding was calculated as the difference between the threshold under temporal crowding and that of a single target. Sixteen subjects, out of the 20 who participated in the previous experiment, participated in this experiment. 
Results
As shown in Figure 6a, the temporal crowding effect (increased letter size) was significantly higher for two than for four interletter spacings (0.21 vs. 0.1°; p < 0.0005). This result of temporal crowding is reminiscent of the data found for spatial crowding (Figure 4). Thus, we compared spatial (simultaneous, SOA = 0) and temporal crowding for the 16 subjects who participated in both experiments. A two-way repeated measures ANOVA of interletter spacing (two vs. four) × SOA (0, 120 ms) revealed a significant effect of spacing, F(1, 15) = 20.3, p < 0.0005; and SOA, F(1, 15) = 43.2, p < 0.0005; and interaction, F(1, 15) = 11.28, p = 0.004. At an ISI of 60 ms and a spacing of two, the effect of crowding was significant (p = 0.00026), but at a spacing of four it was insignificant (p = 0.1). For a spacing of two letters, the effect of temporal crowding is correlated with the spatial crowding, F(1, 14) = 4.93, r = 0.51, p = 0.043. Here we found a correlation between the PF's size for each subject and the threshold elevation under the temporal crowding condition; d′: F(1, 14) = 4.63, r = 0.49, p = 0.049 (Figure 6b); phit: F(1, 14) = 9.14, r = 0.62, p = 0.0090 (Figure 6c). At four interletter spacings, many subjects exhibited no temporal crowding effect, consistent with the prediction of reduced crowding with increasing spatial separation. This relation between spatial and temporal crowding mirrors previous studies of masking, which showed that temporal masking is diminished with larger target–flanker separations (Breitmeyer, 1984; Polat & Sagi, 2006; Polat et al., 2007; Saarela & Herzog, 2008). 
Figure 6
 
Temporal crowding. (a) The temporal crowding effect (y axis) in logMAR units for two and four interletter spacings at 4° of eccentricity. The correlation between the effect of temporal masking in visual angles (y axis) against the estimated size of the PF (x axis, top axis in degrees, bottom axis in λ units). Each data point is the estimated size for each subject. (b) Using d′ as a measurement. (c) Using phit as a measurement. The error bars denote the standard error of the mean (n = 16).
Discussion
Our results confirmed our hypotheses and assumptions that temporal crowding, like spatial crowding, is correlated with the size of the PF. Subjects that have larger PFs are more susceptible to the effect of temporal crowding than are subjects with smaller PFs. This effect also increased with shorter target–flanker separations, consistent with the known effect of spatial crowding (Levi, 2008; Whitney & Levi, 2011). Thus, our major prediction that both spatial crowding and temporal crowding are related to the size of the PF is confirmed at the periphery, where it was shown that masking and crowding are similarly affected by spatial and temporal parameters. 
Our results regarding the effect of spatiotemporal crowding are also consistent with previous studies showing that crowding decreases when the spatial gap (distance) and temporal gap (ISI) increase between the target and flankers (Chung & Patel, 2011; Lev, Yehezkel, et al., 2014). This result of decreased crowding with increased SOA is consistent with our hypotheses and assumptions and our recent study (Lev, Yehezkel, et al., 2014) that crowding is strongest when the time available for target processing is limited, suggesting that it coincides with the short time constant of the I. Our results thus add to the overall conclusion that crowding is also affected by temporal factors (Greenwood, Sayim, & Cavanagh, 2014; Lev, Yehezkel, et al., 2014; Tripathy et al., 2014). Whereas several studies that characterized crowding focused on the spatial domain of the crowding (Danilova & Bondarko, 2007; Flom et al., 1963; Levi, 2008; Pelli et al., 2004; Strasburger & Malania, 2013; Whitney & Levi, 2011), the emerging results (Lev, Yehezkel, et al., 2014; Tripathy et al., 2014) may support the notion that crowding, like masking, may also be affected by temporal parameters at the periphery similarly to the fovea. Next, we explore the effects of crowding and masking at the fovea. 
Effects at the fovea
Experiment 4: Temporal effects of crowding using a letter identification task
Introduction
We showed (Experiment 1), as in a previous study (Lev & Polat, 2011), that the fovea and the periphery (at an eccentricity of 4°) have similar spatial behavior patterns of lateral interactions (collinear facilitation and suppression) when the target–flanker distance is properly scaled and that the effect of increased crowding at the periphery can be explained by the increased PF's size. We also showed a good correlation between PF size and temporal crowding. One main prediction and aim in this study was to find common spatiotemporal parameters that would enable us to develop a unified model for the fovea and periphery. We recently explored the temporal domain of crowding at the fovea and found that the normal fovea exhibits crowding for very brief presentation times and for short target–mask distances (Lev, Yehezkel, et al., 2014). Therefore, here we wanted to investigate whether crowding at the fovea is affected by spatiotemporal parameters as it is at the periphery. To this end, as described in the hypotheses and assumptions, we used the E and I characteristics to estimate the PF border (using a new paradigm that is explained next) and, as at the periphery, we measured crowding using letter identification (Lev, Yehezkel, et al., 2014) and correlated it to masking. 
Method
We measured crowding using the same paradigm that we used recently at the fovea (Lev, Yehezkel, et al., 2014). The targets were Tumbling E patterns presented at the fovea for durations of 30, 60, 120, and 240 ms. A forced-choice paradigm was used in which the subjects were asked to detect whether the open side of a visible letter E (Figure 1c) was to the right or left; subjects reported their answer by pressing the left or the right mouse key, respectively. We used a fixed size for the letters (7.2 arcmin) with two different interletter spacings (center to center) under crowding conditions (0.4 letter spacing = 10.08 arcmin; 1 letter spacing = 14.4 arcmin). A single target letter without crowding was measured as well. The experiments were conducted using a blocked procedure in which only one separation was used, and there were 100 trials per data point in the fixed-size experiment. The order of the blocks was random. Under the crowding condition, an array of randomly facing Es (flankers) surrounding the target was added. The percentages of correct answers for the target alone and under the crowding conditions were measured separately. The crowding effect was indicated by a reduction in the percentage correct under the crowding conditions relative to the target alone. Thirteen new subjects participated in this experiment. 
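The center-to-center spacings quoted above follow from the letter size and the gap expressed in letter units. The sketch below assumes that the center-to-center distance equals one letter width plus the gap, which reproduces the values given in the text; the names are hypothetical.

```python
LETTER_SIZE_ARCMIN = 7.2  # fixed letter size stated in the Method


def center_to_center(letter_size: float, gap_in_letters: float) -> float:
    """Center-to-center spacing = one letter width plus the gap between letters."""
    return round(letter_size * (1.0 + gap_in_letters), 2)


for gap in (0.4, 1.0):
    print(gap, center_to_center(LETTER_SIZE_ARCMIN, gap))
```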
Results
Figure 7a shows the effect of crowding as a function of the presentation time for 0.4 interletter spacing. A two-way repeated measures ANOVA of interletter spacing (one vs. 0.4) × duration (30, 60, 120, and 240 ms) revealed a significant effect of spacing, F(1, 12) = 49.00, p < 0.0005; and duration, F(3, 36) = 30.05, p < 0.0005; and interaction, F(3, 36) = 20.3, p < 0.0005. Whereas the percentage correct for the target alone was reduced only slightly and not significantly for a presentation time of 30 ms, the performance under the crowded condition was reduced significantly for all presentation times (30, 60, 120, and 240 ms). These results are consistent with our previous results (Lev, Yehezkel, et al., 2014) and showed no crowding effect for one interletter spacing; thus, those data are not shown here. 
Figure 7
 
Letter identification (crowding effect) in the fovea as a function of presentation time. (a) Percentage correct (y axis) against the presentation time in milliseconds (x axis). The blue line and filled circles denote the target alone, and the red line and red filled circles denote the crowded conditions. The effect of crowding is significant for all presentation times but is maximal for the shorter ones. (b) The crowding effect (reduction in the percentage correct, y axis) as a function of presentation time (milliseconds). The solid line denotes the linear fit of the data points (r = 0.93, r² = 0.864), showing that crowding is apparent at more than 240 ms. The error bars denote the standard error of the mean (n = 13).
Crowding is reduced with increasing presentation time, reaching the no-crowding level (the critical time for crowding) with longer presentation times. For each subject, we calculated the reduction in the percentage correct (delta) between the crowded and noncrowded conditions for each presentation time. Thus, for each subject there are four data points. We used the convention that the crossing point of the linear fit with the no-change level (null effect, noise, or saturation) provides a reliable estimate of either the threshold or the critical point (Norcia & Tyler, 1985; Norcia, Tyler, Hamer, & Wesemann, 1989; Tyler, Apkarian, Levi, & Nakayama, 1979). The results, presented in Figure 7b, show the average crowding effect of the subjects. The solid line denotes the linear fit of the average data points (r = 0.93, r² = 0.864). The results indicate that the crowding effect is strongest for a shorter presentation time of 30 ms and is reduced with increasing presentation time, showing that crowding is apparent at more than 240 ms and that it reaches a critical time for crowding (zero; dashed extrapolated line) at about 240 to 280 ms. It is possible that the critical time reflects a ceiling effect at 240 ms, where the percentage correct is nearly 100. However, the average result is not exactly 240 ms, indicating that many subjects did not reach the ceiling. Some subjects reached the critical duration at a short time of 60 ms, indicating that this effect may be affected by the task's difficulty (Lev, Yehezkel, et al., 2014). Thus, we contend that our estimate of the critical duration is not due to a ceiling effect. 
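The zero-crossing convention described above can be sketched as an ordinary least-squares fit followed by extrapolation to the no-change level. The data in the example are synthetic values chosen to lie exactly on a line; they are not the measured deltas, and the function name is hypothetical.

```python
def zero_crossing(durations_ms, deltas):
    """Fit delta = slope * duration + intercept by least squares and
    return the duration at which the fitted line reaches zero."""
    n = len(durations_ms)
    mean_x = sum(durations_ms) / n
    mean_y = sum(deltas) / n
    sxx = sum((x - mean_x) ** 2 for x in durations_ms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(durations_ms, deltas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope


# Synthetic illustration only (NOT data from the experiment):
# reduction in percentage correct at the four presentation times.
durations = [30, 60, 120, 240]
synthetic_deltas = [27.0, 24.0, 18.0, 6.0]
critical_time = zero_crossing(durations, synthetic_deltas)
print(round(critical_time, 1))
```

The same function could be applied to each subject's four data points to obtain an individual critical duration, provided that the fitted slope is nonzero.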
Discussion
We used here a method similar to that of Flom et al. (1963), who reported a crowding effect up to about 4 arcmin. A short distance for crowding was also reported in other studies (Danilova & Bondarko, 2007), and it is assumed to be the upper limit of spacing for crowding. However, Flom et al. (1963) used an unlimited presentation time, Danilova and Bondarko (2007) used a 500-ms presentation time, and Levi and Carney (2011) used a 250-ms presentation time that was longer than the time it takes to escape from crowding, as found in our study (Figure 7b). Our results are consistent with our previous study (Lev, Yehezkel, et al., 2014) showing that foveal crowding can be revealed for short presentation times. The spacing between the letters (center to center) in our study is 10.08 arcmin for 0.4 letter spacing and 14.4 arcmin for one letter spacing, which is larger than the spacing beyond which crowding is assumed to be absent in the fovea. Thus, our data confirm our prediction that crowding, like masking, may have a temporal component that affects the results. This result is consistent with a recent study, in the periphery, showing that the range of crowding increases with decreasing presentation time (Tripathy et al., 2014). 
Experiment 5: Spatial masking using Gabor detection for short presentation times
Introduction
As shown above for the periphery, crowding is correlated with the size of the PF and is affected by temporal parameters. We have assumed that lateral interactions, in addition to spatial effects, are also affected by temporal parameters (Cass & Spehar, 2005; Huang & Hess, 2008; Polat & Sagi, 2007). Since we predicted that the fovea will exhibit spatiotemporal relationships similar to the periphery, here we explored the relationships between masking and crowding. 
At the fovea, the size of the PF is very small and there is a sharp transition between facilitation and suppression. Thus, using the standard paradigm of measuring the shift from facilitation to suppression poses a resolution limit that may decrease the accuracy. Therefore, here we used a different approach based on our basic model's assumptions that the PF size is dynamic and changes with a target's contrast and that the optimal size of the receptive field is found under collinear conditions for low-contrast targets when the E and I reach a steady state (Kasamatsu, Miller, Zhu, Chang, & Ishida, 2010). Thus, at the border of the PF, the lateral E balances the local I near the target's contrast threshold. Since the maximal effect of crowding (Figure 7) was at 30 ms, we explored the effect of lateral interactions for this short presentation time of 30 ms to explore the relationships between crowding and masking. 
Method
Contrast detection of a Gabor target was masked by two high-contrast (60%) collinear Gabor flankers. The target–flanker distances were 1.5λ, 2λ, 3λ, and 4λ (0.18°, 0.24°, 0.36°, and 0.48°), presented for 30 ms with a spatial frequency of eight cycles per degree (λ = 0.12°, the same as the letter size in Experiment 4). Thus, since we used σ = λ, the Gabor includes about two cycles (size = approximately 0.24°); therefore, a target–flanker separation of 3λ (0.36°) resulted in a spacing of 0.12° (no overlapping). Thus, the spacing in the Gabor detection that we used here is in the range of the interletter spacing distance under the crowding conditions measured in Experiment 4 for 0.4 and one interletter spacing. In the case of 1.5λ, where the flankers and the target overlap (see Figure 1b), the target–flanker spacing was 0.18° (10.8 arcmin), close to the 0.4 interletter spacing (10.08 arcmin) that showed a crowding effect in Experiment 4. In this case, the target's and the flankers' contrasts are combined. According to this account, the task is contrast discrimination and is considered as pattern masking. 
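The geometry described above can be checked with a short calculation. The sketch below takes λ = 0.12° and a Gabor extent of about two cycles (0.24°) from the text and computes, for each center-to-center separation, the gap between the target and flanker extents. Treating the extent as exactly 2λ is a simplifying assumption, and the names are hypothetical.

```python
WAVELENGTH_DEG = 0.12                  # lambda, as stated in the Method
GABOR_SIZE_DEG = 2 * WAVELENGTH_DEG    # about two cycles (assumed exactly 2 lambda)


def edge_gap_deg(separation_lambda: float) -> float:
    """Gap between target and flanker extents (deg).

    Negative values mean that the two Gabors overlap.
    """
    return round(separation_lambda * WAVELENGTH_DEG - GABOR_SIZE_DEG, 2)


for sep in (1.5, 2, 3, 4):
    gap = edge_gap_deg(sep)
    status = "overlap" if gap < 0 else "no overlap"
    print(sep, round(sep * WAVELENGTH_DEG, 2), gap, status)
```

Under this simplification only the 1.5λ separation yields overlapping patches, the patches just touch at 2λ, and the 3λ separation leaves the 0.12° gap quoted in the text.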
Here we wanted to measure the contrast threshold of the Gabor target; to this end, we used a 2AFC paradigm that is more suitable than yes–no for this purpose (see General method). The target's contrast detection threshold was determined for each condition using two intervals of 30 ms, with an 800-ms gap between them. Participants were asked to report which interval contained the target by pressing a mouse button (left for the first interval and right for the second). The same 13 subjects participated in this experiment. 
Results
As shown in Figure 8, there was a large suppressive effect at 1.5λ (0.52 log unit, p < 0.00005), which decreased rapidly to a lower and insignificant effect at 2λ (0.11 log unit, p = 0.24) and 3λ. The suppressive effect at 1.5λ is slightly larger than the typical amount of suppression reported earlier for the 90-ms presentation times (Polat & Sagi, 1993). In addition, note that the suppression at 1.5λ is significantly higher (p < 0.00005) than at 2λ despite the small difference of 0.5λ (3.6 arcmin) in the target–flanker separation between them. 
Figure 8
 
Suppression zone in the fovea for a short duration time of 30 ms. (a) Masking thresholds in log units (y axis), as measured for collinear configurations (see Figure 1) as a function of the target–flanker separation in λ units (x axis). The solid red line indicates the threshold of the target alone. The data points and the error bars are taken from single measures and are presented at each location for convenience and for comparison with the lateral interaction data. There is a large masking effect for short distances and no facilitation effect at 3λ and 4λ. The error bars denote the standard error of the mean (n = 13).
Discussion
As predicted by our hypotheses and assumptions, the data show that for short presentation times there is an increased suppressive effect and that the expected facilitation effect at 3λ and 4λ is not apparent. This result confirms our prediction that target–flanker separations below 2λ reveal the effects of I from inside the PF, whereas at 2λ to 3λ, at the border of the PF, they reflect an E/I level at which the lateral E balances the local I. This result is consistent with the relationships between I and E at the PF's border for short presentation times. This function of lateral interactions is reminiscent of the function found in amblyopia (Polat et al., 2004) and is consistent with our hypotheses and assumptions regarding the dynamics of lateral interactions. Thus, the results confirm our model's assumption that in lateral masking there is an initial phase of suppression due to fast (transient) local I. The fast suppression may explain the crowding effect that we found in the fovea for short presentation times. Interestingly, the sharp disappearance of the suppressive effect between 1.5λ and 2λ to 3λ is consistent with previous estimates of the PF's size, with 2λ to 3λ at the fovea. Thus, both crowding and masking may be affected by the same suppressive effects. We next explore the relationships between masking and crowding. 
Correlation between Gabor detection (masking) and letter identification (crowding)
Introduction
According to our hypotheses and assumptions, (a) masking and crowding are mediated by E and I and (b) contrast affects the E/I; hence, they affect the size of the PF (Cavanaugh, Bair, & Movshon, 2002a, 2002b; Fitzpatrick, 2000; Kapadia, Westheimer, & Gilbert, 1999; Kasamatsu et al., 2010; Sceniak, Ringach, Hawken, & Shapley, 1999). When the stimulus approaches the contrast detection threshold, the response may be evoked only from those neurons that are driven by the stimulus (Pelli, 1985; Stemmler, Usher, & Niebur, 1995). Thus, the optimal PF size may be found for a low-contrast target under a collinear configuration (Kasamatsu et al., 2010). 
As described in the hypotheses and assumptions, the size of a PF under conditions of lateral interactions near the contrast threshold may reveal the optimal PF size for exploring the relationships between masking and crowding. Therefore, we measured the contrast detection threshold under lateral masking conditions to estimate the optimal PF size. Thus, we explored the relationships among contrast detection threshold, letter identification (crowding), and the critical duration needed to overcome the crowding. We chose spatiotemporal parameters (2λ, 3λ, and 4λ) that were intended to measure the effects at the border or outside the PF (1.5λ as shown above is inside the PF) and a 0.4 interletter spacing that shows the effect of crowding. 
Method
We calculated, across subjects, the correlation between the contrast detection thresholds under masking conditions (target–flanker distances of 2λ, 3λ, and 4λ; Experiment 5) and the letter identification (crowding) effect for 0.4 interletter spacing at each presentation time (30, 60, and 120 ms; Experiment 4). 
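As an illustration of this across-subject analysis, the following minimal sketch computes a Pearson correlation between one masking threshold and one crowding effect per observer. The five data pairs are hypothetical placeholders rather than values from the study, and the function name is an assumption made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for five observers (NOT data from the study):
# masking threshold elevation at one separation (log units) and
# crowding effect at one presentation time (reduction in percentage correct).
masking_threshold_log = [0.10, 0.25, 0.05, 0.40, 0.30]
crowding_effect_pct = [12.0, 20.0, 8.0, 35.0, 22.0]

r = pearson_r(masking_threshold_log, crowding_effect_pct)

if __name__ == "__main__":
    print(f"r = {r:.3f}")
```

In the design described above, the same computation would be repeated for each combination of target–flanker separation and presentation time, with one data point per subject.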
Results
The presentation time and target–flanker separation affected the correlation between masking and crowding. Figure 9 shows the correlation between the contrast detection and the letter identification effects for each presentation time (30, 60, and 120 ms) and for each target–flanker separation (2λ, 3λ, and 4λ); the strongest correlation was always found for a target–flanker separation of 3λ. Note also that for each target–flanker separation the correlation increased with increasing presentation time. The details of all correlations and the statistics are provided in the figure captions. 
Figure 9
 
Correlation between letter crowding and Gabor masking at different target–flanker separations. Crowding effect (reduction in percentage correct; y axis) for 30 (top), 60 (middle), and 120 (bottom) ms for a target–flanker separation of 2λ (left), 3λ (middle), and 4λ (right). The x axis denotes the masking threshold as a function of target–flanker distance. Each data point denotes data for one subject, and the solid line denotes the correlation fit of the dots. The correlation increases with increasing presentation time, but it is always highest for 3λ (n = 13).
The results showed that the correlation between contrast detection using Gabors and letter identification is stronger for longer presentation times. This result is consistent with our hypotheses and assumptions that the E/I is not fixed, suggesting that subjects having a low contrast threshold also have less of an inhibitory effect, thus enabling the slower effect of collinear facilitation to be more effective (Cass & Alais, 2006; Polat & Sagi, 2006; Sterkin & Polat, 2008; Sterkin, Yehezkel, Bonneh, Norcia, & Polat, 2009). In addition, we noted that the best correlation for each presentation time was found at 3λ, a target–mask separation possibly reflecting the E/I balance at the PF's border at the fovea. Moreover, the results support our hypotheses and assumptions that the best correlation between masking and crowding might be attributed to estimating the PF's border under lateral masking conditions. Consistent with this assumption is our finding that the correlation between masking from within the PF (1.5λ, overlapping masking) and crowding is small and not significant (30 ms: r = 0.4, p = 0.15; 60 ms: r = 0.5, p = 0.08; 120 ms: r = 0.35, p = 0.23), whereas the correlation at 2λ is slightly stronger but barely significant, and only for the longest duration of 120 ms. In contrast, for 3λ, the correlation is significant for all durations and is strongest for 60 and 120 ms. Thus, consistent with our model's predictions, the correlation may reflect the E/I balance at the border of the PF and, consequently, the optimal size of the RF under masking and crowding conditions. Moreover, the E/I balance was reached at a spatial separation of 3λ and for a temporal balance between the time constant of decaying I and the time constant of increasing E. Thus, the correlation at 3λ may reveal the spatial and temporal window in which masking and crowding processing reach a balance; hence, it shows the strongest correlation. 
Spatial and temporal relationships between contrast detection (masking) and letter identification (crowding)
Introduction
Based on our hypotheses and assumptions, we chose spatiotemporal parameters that were assumed to be optimal for I (a short presentation time) and for measuring the contrast threshold under conditions of lateral interactions to explore the relationships between masking and crowding in the fovea. 
Method
We used the contrast detection threshold that was found in Experiment 5 for a presentation time of 30 ms. We also used the calculated critical time for crowding (no-crowding effect, Experiment 4; Figure 7b) by extrapolating the curve to the zero crowding level for each subject (as shown for the average group in Figure 7b). For each subject, we calculated the reduction in the percentage correct (delta) between the crowded and noncrowded conditions for each presentation time (30, 60, 120, and 240 ms). Then we calculated the first presentation time at which the fitted linear slope reached the no-crowding effect (zero). 
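The extrapolation step can be sketched as follows: fit a straight line to the crowding effect (delta) as a function of presentation time and solve for the time at which the fitted line reaches zero. This is a minimal sketch that assumes an ordinary least-squares fit on a linear millisecond axis, which this section does not specify; the delta values are hypothetical and the function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def critical_time_ms(times_ms, delta_pct):
    """Presentation time at which the fitted line reaches zero crowding.

    Returns None when the fitted crowding effect does not decline with time,
    because the line then never crosses zero at a positive duration.
    """
    slope, intercept = fit_line(times_ms, delta_pct)
    if slope >= 0:
        return None
    return -intercept / slope

# Presentation times from the text; delta values are hypothetical
# (reduction in percentage correct, crowded vs. noncrowded) for one subject.
times_ms = [30, 60, 120, 240]
delta_pct = [40.0, 32.0, 20.0, 2.0]

t_crit = critical_time_ms(times_ms, delta_pct)

if __name__ == "__main__":
    print(f"estimated critical time: {t_crit:.0f} ms")
```

Because the zero crossing is extrapolated from the fitted line, the estimate can fall beyond the longest tested duration for subjects whose crowding effect declines slowly.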
Results
There was variability among subjects regarding this critical time (as can be seen from the error bars in Figure 7b). There was also variability in the contrast detection threshold among subjects. We calculated, across subjects, the correlation between the critical time for letter identification under the crowding condition and target detection under masking at 2λ, 3λ, and 4λ for a presentation time of 30 ms. 
As Figure 10 shows, there was a very strong correlation between the critical time to escape crowding and the threshold at 3λ, F(1, 11) = 162.0, r = 0.97, p < 0.00005. Weaker correlations were found at 2λ, F(1, 11) = 9.7, r = 0.68, p = 0.0098, and at 4λ, F(1, 11) = 21.43, r = 0.81, p = 0.0007. For 1.5λ the correlation was not significant, F(1, 11) = 2.77, r = 0.44, p = 0.12. Subjects who had lower contrast detection thresholds under Gabor masking conditions escaped from crowding (letter identification) at shorter times, but subjects with higher contrast detection thresholds escaped from crowding (letter identification) only after longer times. Thus, these results support our model that masking and crowding are related when certain parameters are used in the space and time domains but that the correlation may be weaker with other combinations of parameters, such as target–flanker separations and presentation times. 
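For a regression with a single predictor, the F statistic and the correlation coefficient are tied by the standard identity F(1, df) = r² · df / (1 − r²), with df = n − 2 = 11 for 13 subjects. The sketch below uses this identity to recover the magnitude of r from each reported F value as a consistency check. It assumes that the reported statistics come from a simple linear regression, which the degrees of freedom suggest but the text does not state explicitly; the function names are illustrative.

```python
import math

def r_from_f(f_value: float, df_error: int) -> float:
    """Magnitude of r implied by F(1, df_error) in a one-predictor regression."""
    return math.sqrt(f_value / (f_value + df_error))

def f_from_r(r: float, df_error: int) -> float:
    """F(1, df_error) implied by a correlation coefficient r."""
    return r * r * df_error / (1.0 - r * r)

DF_ERROR = 13 - 2  # n = 13 subjects

# (F, reported r) pairs quoted in the text, keyed by separation in lambda units.
reported = {
    "1.5": (2.77, 0.44),
    "2": (9.7, 0.68),
    "3": (162.0, 0.97),
    "4": (21.43, 0.81),
}

# Magnitude of r recovered from each reported F value.
recovered = {sep: r_from_f(f, DF_ERROR) for sep, (f, _) in reported.items()}

if __name__ == "__main__":
    for sep, (f, r) in reported.items():
        print(f"{sep} lambda: reported r = {r:.2f}, r from F = {recovered[sep]:.3f}")
```

The recovered magnitudes agree with the reported r values to within about 0.01, the level expected from rounding. The identity gives only the magnitude of r, not its sign.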
Figure 10
 
The correlation between the critical time to escape from crowding and masking. For each subject the critical duration in milliseconds to escape from crowding (no crowding, y axis) is presented against masking thresholds (log units) for different target–flanker distances: (a) 2λ, (b) 3λ, and (c) 4λ. Each data point denotes data for one subject. A very high correlation was found for 3λ (n = 13).
Discussion
We found here for the fovea, consistent with our previous study (Lev, Yehezkel, et al., 2014), that crowding is most pronounced for a presentation time of 30 ms and is reduced for longer presentation times and that crowding is found only for short target–flanker distances. As we showed in Figure 8 for a presentation time of 30 ms, the border of suppression is at 2λ to 3λ (the critical distance for suppression). Thus, the strongest correlation between the critical time for crowding and the critical distance for masking at 3λ may reflect the correlation between spatiotemporal processing at the border of the PF, which affects both masking and crowding. This result is consistent with our model's prediction that the E/I balance is reached at a spatial separation of 3λ for a short temporal time of 30 ms or more. 
General discussion
Our results confirm our hypotheses and assumptions that masking and crowding should be related to the PF's size, at the fovea and periphery, under a specific range of spatiotemporal parameters. Our results, in agreement with and in addition to many previous studies (Levi, 2008; Whitney & Levi, 2011), indicate that multidimensional parameters and multiple factors may affect the experimental outcome. Thus, our results indicate that crowding and masking could be related or not, depending on the particular spatiotemporal parameters chosen in the study. Next, we discuss how to characterize the spatiotemporal domain, which may determine the relationships between masking and crowding. 
PF size as an essential factor
A major prediction in our model is that masking and crowding effects are related to the PF's size. Our results indicate that the spatial range of both masking and crowding is increased at the periphery in correlation with the increase in PF size. At the periphery, our estimate of the PF's size is larger than that at the fovea by about a factor of two, and is about 5λ, which is consistent with our previous estimate of the PF (Lev & Polat, 2011). At the fovea we found that the border of the masking effect (2λ–3λ) is consistent with the size of the PF at the fovea (Neri & Levi, 2006; Polat & Sagi, 1993; Polat & Tyler, 1999; Watson, 1982; Watson et al., 1983). Thus, our results are consistent with our suggestion that crowding and masking, under specific spatiotemporal parameters, can be viewed as pattern masking (ordinary masking) even without any apparent overlapping between the target and flankers. 
Comparison between the fovea and the periphery
In both the fovea and the periphery, the best and most significant correlations between masking and crowding are found at the border of the PF. In the fovea it was found for 3λ, whereas at the periphery it was found for 5λ. In both cases, there is interaction between space and time. Thus, this result is consistent with our hypotheses and assumptions that a larger PF size should result in the strongest masking and crowding and vice versa; hence, an estimate of the PF's size is essential for understanding the relationships between masking and crowding. Moreover, the best correlation was found for certain target–flanker separations (spatial gaps) and certain temporal windows that can be seen as a temporal gap between the onset of the target processing and when it reaches the E/I balance. This effect is seen by increasing the presentation time that is required for the E to take place or by a certain ISI that produces a temporal gap between the target and the crowded letters. Thus, for both the fovea and the periphery, we show that similar characteristics exist for masking and crowding in space and time. 
How masking and crowding may be related
Models of spatial vision suggest two stages: feature detection and identification. In this regard, masking may be viewed as first-order processing at threshold, whereas crowding, as an identification task, may be regarded as second-order processing at suprathreshold, which may rely on the output of the first order. Thus, the output of the detection process may serve as input for the mechanism that underlies letter identification. It is also suggested that crowding may be regarded as grouping (Chakravarthi & Pelli, 2011; Herzog & Fahle, 2002; Malania, Herzog, & Westheimer, 2007; Manassi et al., 2012) and may be determined by multiple sources of processing that may be operating at several levels of representation (Levi, Hariharan, et al., 2002; Livne & Sagi, 2011). 
Grouping is suggested as a two-phase process: base grouping and incremental grouping (Roelfsema, 2006). Base groupings are coded by single neurons tuned to multiple features and are computed rapidly because they reflect the selectivity of feed-forward connections. A second phase, incremental grouping, enhances the responses of neurons coding features, but it takes more time than does base grouping because it also relies on horizontal and feedback connections. This view is consistent with our previous data (Polat & Sagi, 2006; Sterkin & Polat, 2008) suggesting that the process of grouping consists of two phases: The first one is suppressive, fast, and transient and the second one is facilitative, delayed, slower, and sustained. According to this view, the first phase of grouping, which is dominated by I, acts to reduce the activity in the network (suppression). Then the second phase of grouping, dominated by lateral E, provides segregation of contours from the background and makes them salient for further processing (Polat & Sagi, 2006; Sterkin & Polat, 2008). Indeed, physiological studies show that the latter phase provides a segregation of figures from the ground (Lamme, 1995; Roelfsema, 2006; Roelfsema, Tolboom, & Khayat, 2007). Thus, crowding may be related to the first phase of grouping and may be considered as the suppressive phase. We showed here and recently (Lev, Yehezkel, et al., 2014) that the suppressive effect (crowding) is transient but that longer presentation times enable correct identification of the target. Thus, this second part can be viewed as the second phase of grouping, which enables figure–ground segregation (Neri & Levi, 2007; Polat & Sagi, 2006; Sterkin & Polat, 2008) and, hence, correct identification of the target from the background of flankers (uncrowding). 
We sought a correlation between crowding and masking and found the maximal correlation for conditions where E/I reaches an optimal level between lateral E and local I (e.g., at 3λ and 60 and 120 ms; see Figure 9). This most likely indicates the transition point from the first phase (inhibitory) to the second phase (excitatory). 
Dynamics of masking and crowding (temporal domain)
In the spatial domain, both masking and crowding are affected by target–flanker separations. In the temporal domain, masking (Figure 8) and crowding (Figure 7) are evident for short presentation times and are strongly reduced for longer presentation times. If the second processing stage is delayed until the processing of the first stage has been completed (Neri & Heeger, 2002), one can predict that crowding would be found at longer times than masking. However, the calculated critical duration for overcoming crowding at the fovea can reach about 300 ms for some subjects (see Figure 10). 
These long-duration effects are consistent with our assumption that the inhibitory effect is reduced by E, which is delayed. Thus, the critical time needed to overcome crowding depends on the dynamics of E and I and on the time that it takes to reach E/I at an optimal level. Since this E/I level depends on the stimulus parameters and on the network, it may be reached with different spatiotemporal parameters for each subject, resulting in a large temporal window for effective crowding. Therefore, whether crowding is a second stage that is delayed and relies on the output of the masking processing (i.e., a parallel or continuous process), as our model suggests, remains an open question. 
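The dependence of the critical time on the dynamics of E and I can be illustrated with a deliberately simple toy sketch. It is not the model used in this study: it assumes, purely for illustration, that I decays exponentially from a normalized value of 1, that E rises exponentially toward 1, and that the balance point is the time at which the two are equal. The time constants and function names are hypothetical.

```python
import math

def inhibition(t_ms: float, tau_i_ms: float) -> float:
    """Toy assumption: fast, transient inhibition decaying from 1 toward 0."""
    return math.exp(-t_ms / tau_i_ms)

def excitation(t_ms: float, tau_e_ms: float) -> float:
    """Toy assumption: slower, sustained excitation rising from 0 toward 1."""
    return 1.0 - math.exp(-t_ms / tau_e_ms)

def balance_time_ms(tau_i_ms: float, tau_e_ms: float,
                    upper_ms: float = 2000.0, tol: float = 1e-6) -> float:
    """Time at which E(t) equals I(t), found by bisection.

    E - I rises monotonically from -1 at t = 0, so there is a single crossing
    whenever E exceeds I at the upper bound.
    """
    def net(t: float) -> float:
        return excitation(t, tau_e_ms) - inhibition(t, tau_i_ms)

    if net(upper_ms) <= 0:
        raise ValueError("no balance point below upper_ms")
    lo, hi = 0.0, upper_ms
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    # Hypothetical time constants for two observers with different E dynamics.
    for tau_e in (50.0, 150.0):
        t = balance_time_ms(tau_i_ms=50.0, tau_e_ms=tau_e)
        print(f"tau_E = {tau_e:.0f} ms -> balance at {t:.1f} ms")
```

In this toy setting, a slower rise of E pushes the balance point to later times, which is one way to picture why the time needed to overcome crowding could differ widely between subjects.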
Table 1.
 
Statistics for the correlation between letter crowding and Gabor masking (Figure 9).
Acknowledgments
This work was performed in partial fulfillment of the requirements for a doctoral degree by Maria Lev at the Sackler Faculty of Medicine, Tel Aviv University, Israel. We thank Yoram Bonneh, Dov Sagi, and Dennis Levi for their helpful comments. This study was supported by grants from the Israel Science Foundation (ISF188/2010). 
Commercial relationships: none. 
Corresponding author: Uri Polat. 
Address: Goldschleger Eye Research Institute, the Sackler Faculty of Medicine, Tel-Aviv University, Tel-Hashomer, Israel. 
References
Adini Y., Sagi D. (2001). Recurrent networks in human visual cortex: Psychophysical evidence. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 18 (9), 2228–2236.
Adini Y., Sagi D., Tsodyks M. (1997). Excitatory-inhibitory network in the visual cortex: Psychophysical evidence. Proceedings of the National Academy of Sciences, USA, 94 (19), 10426–10431.
Albrecht D. G., Hamilton D. B. (1982). Striate cortex of monkey and cat: Contrast response function. Journal of Neurophysiology, 48 (1), 217–237.
Amiaz R., Zomet A., Polat U. (2011). Excitatory repetitive transcranial magnetic stimulation over the dorsolateral prefrontal cortex does not affect perceptual filling-in in healthy volunteers. Vision Research, 51 (18), 2071–2076, doi:10.1016/j.visres.2011.08.003.
Bair W., Cavanaugh J. R., Movshon J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. Journal of Neuroscience, 23 (20), 7690–7701.
Bolz J., Gilbert C. D. (1989). The role of horizontal connections in generating long receptive fields in the cat visual cortex. European Journal of Neuroscience, 1 (3), 263–268.
Bonneh Y., Sagi D. (1998). Effects of spatial configuration on contrast detection. Vision Research, 38 (22), 3541–3553.
Bonneh Y. S., Sagi D., Polat U. (2007). Spatial and temporal crowding in amblyopia. Vision Research, 47 (14), 1950–1962.
Bouma H. (1970). Interaction effects in parafoveal letter recognition. Nature, 226 (241), 177–178.
Breitmeyer B. G. (1984). Visual masking: An integrative approach (Vol. 4). New York, NY: Oxford University Press.
Breitmeyer B. G., Ogmen H. (2000). Recent models and findings in visual backward masking: A comparison, review, and update. Perception and Psychophysics, 62 (8), 1572–1595.
Cannon M. W., Fullenkamp S. C. (1991). Spatial interactions in apparent contrast: Inhibitory effects among grating patterns of different spatial frequencies, spatial positions and orientations. Vision Research, 31 (11), 1985–1998.
Cass J., Alais D. (2006). The mechanisms of collinear integration. Journal of Vision, 6 (9): 5, 915–922, doi:10.1167/6.9.5.
Cass J. R., Spehar B. (2005). Dynamics of collinear contrast facilitation are consistent with long-range horizontal striate transmission. Vision Research, 45 (21), 2728–2739.
Cavanaugh J. R., Bair W., Movshon J. A. (2002a). Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology, 88 (5), 2530–2546, doi:10.1152/jn.00692.2001.
Cavanaugh J. R., Bair W., Movshon J. A. (2002b). Selectivity and spatial distribution of signals from the receptive field surround in macaque V1 neurons. Journal of Neurophysiology, 88 (5), 2547–2556, doi:10.1152/jn.00693.2001.
Chakravarthi R., Cavanagh P. (2009). Recovery of a crowded object by masking the flankers: Determining the locus of feature integration. Journal of Vision, 9 (10): 4, 1–9, doi:10.1167/9.10.4.
Chakravarthi R., Pelli D. G. (2011). The same binding in contour integration and crowding. Journal of Vision, 11 (8): 10, 1–12, doi:10.1167/11.8.10.
Chen C. C., Kasamatsu T., Polat U., Norcia A. M. (2001). Contrast response characteristics of long-range lateral interactions in cat striate cortex. Neuroreport, 12 (4), 655–661.
Chen C. C., Tyler C. W. (2002). Lateral modulation of contrast discrimination: Flanker orientation effects. Journal of Vision, 2 (6): 8, 520–530, doi:10.1167/2.6.8.
Chen C. C., Tyler C. W. (2008). Excitatory and inhibitory interaction fields of flankers revealed by contrast-masking functions. Journal of Vision, 8 (4): 10, 1–14, doi:10.1167/8.4.10.
Chubb C., Sperling G., Solomon J. A. (1989). Texture interactions determine perceived contrast. Proceedings of the National Academy of Sciences, USA, 86 (23), 9631–9635.
Chung S., Patel S. (2011). Temporal dynamics of the crowding mechanism. Journal of Vision, 11 (11): 1143, doi:10.1167/11.11.1143.
Chung S. T., Levi D. M., Legge G. E. (2001). Spatial-frequency and contrast properties of crowding. Vision Research, 41 (14), 1833–1850.
Chung S. T., Mansfield J. S. (2009). Contrast polarity differences reduce crowding but do not benefit reading performance in peripheral vision. Vision Research, 49 (23), 2782–2789, doi:10.1016/j.visres.2009.08.013.
Coates D. R., Chin J. M., Chung S. T. (2013). Factors affecting crowded acuity: Eccentricity and contrast. Optometry and Vision Science, 90 (7), 628–638, doi:10.1097/OPX.0b013e31829908a4.
Coates D. R., Levi D. M. (2014). Contour interaction in foveal vision: A response to Siderov, Waugh, and Bedell (2013). Vision Research, 96, 140–144, doi:10.1016/j.visres.2013.10.016.
Daniel P. M., Whitteridge D. (1961). The representation of the visual field on the cerebral cortex in monkeys. Journal of Physiology, 159, 203–221.
Danilova M. V., Bondarko V. M. (2007). Foveal contour interactions and crowding effects at the resolution limit of the visual system. Journal of Vision, 7 (2): 25, 1–18, doi:10.1167/7.2.25.
Dow B., Snyder A. Z., Vautin R. G., Bauer R. (1981). Magnification factor and receptive field size in foveal striate cortex of the monkey. Experimental Brain Research, 44 (2), 213–228.
Duncan R. O., Boynton G. M. (2003). Cortical magnification within human primary visual cortex correlates with acuity thresholds. Neuron, 38 (4), 659–671.
Enns J. T., Di Lollo V. (2000). What's new in visual masking? Trends in Cognitive Science, 4 (9), 345–352.
Fitzpatrick D. (2000). Seeing beyond the receptive field in primary visual cortex. Current Opinion in Neurobiology, 10 (4), 438–443.
Flom M. C., Weymouth F. W., Kahneman D. (1963). Visual resolution and contour interaction. Journal of the Optical Society of America, 53 (9), 1026–1032.
Foley J. M., Varadharajan S., Koh C. C., Farias M. C. (2007). Detection of Gabor patterns of different sizes, shapes, phases and eccentricities. Vision Research, 47 (1), 85–107, doi:10.1016/j.visres.2006.09.005.
Francis G. (2000). Quantitative theories of metacontrast masking. Psychological Review, 107 (4), 768–785.
Freeman E., Sagi D., Driver J. (2001). Lateral interactions between targets and flankers in low-level vision depend on attention to the flankers. Nature Neuroscience, 4 (10), 1032–1036.
Gilbert C. D., Li W. (2013). Top-down influences on visual processing. Nature Reviews Neuroscience, 14 (5), 350–363, doi:10.1038/nrn3476.
Gilbert C. D., Wiesel T. N. (1989). Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. Journal of Neuroscience, 9 (7), 2432–2442.
Giorgi R. G., Soong G. P., Woods R. L., Peli E. (2004). Facilitation of contrast detection in near-peripheral vision. Vision Research, 44 (27), 3193–3202.
Greenwood J. A., Sayim B., Cavanagh P. (2014). Crowding is reduced by onset transients in the target object (but not in the flankers). Journal of Vision, 14 (6): 2, 1–21, doi:10.1167/14.6.2.
Grinvald A., Lieke E. E., Frostig R. D., Hildesheim R. (1994). Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex. Journal of Neuroscience, 14 (5 Pt. 1), 2545–2568.
He S., Cavanagh P., Intriligator J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383 (6598), 334–337.
Herzog M. H., Fahle M. (2002). Effects of grouping in contextual modulation. Nature, 415 (6870), 433–436.
Hirsch J. A., Gilbert C. D. (1991). Synaptic physiology of horizontal connections in the cat's visual cortex. Journal of Neuroscience, 11 (6), 1800–1809.
Huang P. C., Hess R. F. (2008). The dynamics of collinear facilitation: Fast but sustained. Vision Research, 48 (27), 2715–2722, doi:10.1016/j.visres.2008.09.013.
Jung R., Spillmann P. (1970). Receptive-field estimation and perceptual integration in human vision. In Young F. A., Lindsley D. B. (Eds.), Early experience and visual information processing in perceptual and reading disorders (pp. 181–197). Washington, DC: National Academy of Sciences Proceedings.
Kapadia M. K., Westheimer G., Gilbert C. D. (1999). Dynamics of spatial summation in primary visual cortex of alert monkeys. Proceedings of the National Academy of Sciences, USA, 96 (21), 12073–12078.
Kapadia M. K., Westheimer G., Gilbert C. D. (2000). Spatial distribution of contextual interactions in primary visual cortex and in visual perception. Journal of Neurophysiology, 84 (4), 2048–2062.
Kasamatsu T., Miller R., Zhu Z., Chang M., Ishida Y. (2010). Collinear facilitation is independent of receptive-field expansion at low contrast. Experimental Brain Research, 201 (3), 453–465, doi:10.1007/s00221-009-2057-1.
Kisvarday Z. F., Toth E., Rausch M., Eysel U. T. (1997). Orientation-specific relationship between populations of excitatory and inhibitory lateral connections in the visual cortex of the cat. Cerebral Cortex, 7 (7), 605–618.
Knierim J. J., van Essen D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67 (4), 961–980.
Kooi F. L., Toet A., Tripathy S. P., Levi D. M. (1994). The effect of similarity and duration on spatial interaction in peripheral vision. Spatial Vision, 8 (2), 255–279.
Kovacs I. (1996). Gestalten of today: Early processing of visual contours and surfaces. Behavioural Brain Research, 82 (1), 1–11.
Lamme V. A. (1995). The neurophysiology of figure-ground segregation in primary visual cortex. Journal of Neuroscience, 15 (2), 1605–1615.
Lev M., Gilaie-Dotan S., Gotthilf-Nezri D., Yehezkel O., Brooks J. L., Perry A., Polat U. (2015). Training-induced recovery of low-level vision followed by mid-level perceptual improvements in developmental object and face agnosia. Developmental Science, 18, 50–64, doi:10.1111/desc.12178.
Lev M., Ludwig K., Gilaie-Dotan S., Voss S., Sterzer P., Hesselmann G., Polat U. (2014). Training improves visual processing speed and generalizes to untrained functions. Scientific Reports, 4, 7251, doi:10.1038/srep07251.
Lev M., Polat U. (2011). Collinear facilitation and suppression at the periphery. Vision Research, 51 (23–24), 2488–2498, doi:10.1016/j.visres.2011.10.008.
Lev M., Yehezkel O., Polat U. (2014). Uncovering foveal crowding? Scientific Reports, 4, 4067, doi:10.1038/srep04067.
Levi D. M. (2008). Crowding—An essential bottleneck for object recognition: A mini-review. Vision Research, 48 (5), 635–654, doi:10.1016/j.visres.2007.12.009.
Levi D. M., Carney T. (2011). The effect of flankers on three tasks in central, peripheral, and amblyopic vision. Journal of Vision, 11 (1): 10, 1–23, doi:10.1167/11.1.10.
Levi D. M., Hariharan S., Klein S. A. (2002). Suppressive and facilitatory spatial interactions in peripheral vision: Peripheral crowding is neither size invariant nor simple contrast masking. Journal of Vision, 2 (2): 3, 167–177, doi:10.1167/2.2.3.
Levi D. M., Klein S. A. (1985). Vernier acuity, crowding and amblyopia. Vision Research, 25 (7), 979–991.
Levi D. M., Klein S. A., Aitsebaomo A. P. (1985). Vernier acuity, crowding and cortical magnification. Vision Research, 25 (7), 963–977.
Levi D. M., Klein S. A., Hariharan S. (2002). Suppressive and facilitatory spatial interactions in foveal vision: Foveal crowding is simple contrast masking. Journal of Vision, 2 (2): 2, 140–166, doi:10.1167/2.2.2.
Levitt J. B., Lund J. S. (1997). Contrast dependence of contextual effects in primate visual cortex. Nature, 387 (6628), 73–76.
Levitt J. B., Lund J. S. (2002). The spatial extent over which neurons in macaque striate cortex pool visual signals. Visual Neuroscience, 19 (4), 439–452.
Livne T., Sagi D. (2011). Multiple levels of orientation anisotropy in crowding with Gabor flankers. Journal of Vision, 11 (13): 18, 1–10, doi:10.1167/11.13.18.
Malania M., Herzog M. H., Westheimer G. (2007). Grouping of contextual elements that affect vernier thresholds. Journal of Vision, 7 (2): 1, 1–7, doi:10.1167/7.2.1.
Manassi M., Sayim B., Herzog M. H. (2012). Grouping, pooling, and when bigger is better in visual crowding. Journal of Vision, 12 (10): 13, 1–14, doi:10.1167/12.10.13.
Manassi M., Sayim B., Herzog M. H. (2013). When crowding of crowding leads to uncrowding. Journal of Vision, 13 (13): 10, 1–10, doi:10.1167/13.13.10.
Maniglia M., Pavan A., Cuturi L. F., Campana G., Sato G., Casco C. (2011). Reducing crowding by weakening inhibitory lateral interactions in the periphery with perceptual learning. PLoS One, 6 (10), e25568, doi:10.1371/journal.pone.0025568.
Maniglia M., Pavan A., Trotter Y. (2015). The effect of spatial frequency on peripheral collinear facilitation. Vision Research, 107, 146–154, doi:10.1016/j.visres.2014.12.008.
Marcelja S. (1980). Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of America, 70 (11), 1297–1300.
Meirovithz E., Ayzenshtat I., Bonneh Y. S., Itzhack R., Werner-Reiss U., Slovin H. (2010). Population response to contextual influences in the primary visual cortex. Cerebral Cortex, 20 (6), 1293–1304, doi:10.1093/cercor/bhp191.
Mizobe K., Polat U., Pettet M. W., Kasamatsu T. (2001). Facilitation and suppression of single striate-cell activity by spatially discrete pattern stimuli presented beyond the receptive field. Visual Neuroscience, 18 (3), 377–391.
Neri P., Heeger D. J. (2002). Spatiotemporal mechanisms for detecting and identifying image features in human vision. Nature Neuroscience, 5 (8), 812–816, doi:10.1038/nn886.
Neri P., Levi D. M. (2006). Receptive versus perceptive fields from the reverse-correlation viewpoint. Vision Research, 46 (16), 2465–2474, doi:10.1016/j.visres.2006.02.002.
Neri P., Levi D. M. (2007). Temporal dynamics of figure-ground segregation in human vision. Journal of Neurophysiology, 97 (1), 951–957, doi:10.1152/jn.00753.2006.
Norcia A. M., Tyler C. W. (1985). Spatial frequency sweep VEP: Visual acuity during the first year of life. Vision Research, 25 (10), 1399–1408.
Norcia A. M., Tyler C. W., Hamer R. D., Wesemann W. (1989). Measurement of spatial contrast sensitivity with the swept contrast VEP. Vision Research, 29 (5), 627–637.
Parkes L., Lund J., Angelucci A., Solomon J. A., Morgan M. (2001). Compulsory averaging of crowded orientation signals in human vision. Nature Neuroscience, 4 (7), 739–744.
Pelli D. G. (1985). Uncertainty explains many aspects of visual contrast detection and discrimination. Journal of the Optical Society of America A, 2 (9), 1508–1532.
Pelli D. G., Palomares M., Majaj N. J. (2004). Crowding is unlike ordinary masking: Distinguishing feature integration from detection. Journal of Vision, 4 (12): 12, 1136–1169, doi:10.1167/4.12.12. [PubMed] [Article]
Pelli D. G., Tillman K. A. (2008). The uncrowded window of object recognition. Nature Neuroscience, 11 (10), 1129–1135.
Polat U. (1999). Functional architecture of long-range perceptual interactions. Spatial Vision, 12 (2), 143–162.
Polat U. (2009a). Effect of spatial frequency on collinear facilitation. Spatial Vision, 22 (2), 179–193, doi:10.1163/156856809787465609.
Polat U. (2009b). Making perceptual learning practical to improve visual functions. Vision Research, 49 (21), 2566–2573, doi:10.1016/j.visres.2009.06.005.
Polat U., Ma-Naim T., Belkin M., Sagi D. (2004). Improving vision in adult amblyopia by perceptual learning. Proceedings of the National Academy of Sciences, USA, 101 (17), 6692–6697.
Polat U., Ma-Naim T., Spierer A. (2009). Treatment of children with amblyopia by perceptual learning. Vision Research, 49 (21), 2599–2603, doi:10.1016/j.visres.2009.07.008.
Polat U., Mizobe K., Pettet M. W., Kasamatsu T., Norcia A. M. (1998). Collinear stimuli regulate visual responses depending on cell's contrast threshold. Nature, 391 (6667), 580–584.
Polat U., Norcia A. M. (1996). Neurophysiological evidence for contrast dependent long-range facilitation and suppression in the human visual cortex. Vision Research, 36 (14), 2099–2109.
Polat U., Norcia A. M. (1998). Elongated physiological summation pools in the human visual cortex. Vision Research, 38 (23), 3735–3741.
Polat U., Sagi D. (1993). Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research, 33 (7), 993–999.
Polat U., Sagi D. (1994a). The architecture of perceptual spatial interactions. Vision Research, 34 (1), 73–78.
Polat U., Sagi D. (1994b). Spatial interactions in human vision: From near to far via experience-dependent cascades of connections. Proceedings of the National Academy of Sciences, USA, 91 (4), 1206–1209.
Polat U., Sagi D. (2006). Temporal asymmetry of collinear lateral interactions. Vision Research, 46 (6–7), 953–960.
Polat U., Sagi D. (2007). The relationship between the subjective and objective aspects of visual filling-in. Vision Research, 47 (18), 2473–2481, doi:10.1016/j.visres.2007.06.007.
Polat U., Schor C., Tong J. L., Zomet A., Lev M., Yehezkel O., Levi D. M. (2012). Training the brain to overcome the effect of aging on the human eye. Scientific Reports, 2, 278, doi:10.1038/srep00278.
Polat U., Sterkin A., Yehezkel O. (2007). Spatio-temporal low-level neural networks account for visual masking. Advances in Cognitive Psychology, 3 (1–2), 153–165, doi:10.2478/v10053-008-0021-4.
Polat U., Tyler C. W. (1999). What pattern the eye sees best. Vision Research, 39 (5), 887–895.
Rashal E., Yeshurun Y. (2014). Contrast dissimilarity effects on crowding are not simply another case of target saliency. Journal of Vision, 14 (6): 9, 1–12, doi:10.1167/14.6.9. [PubMed] [Article]
Roelfsema P. R. (2006). Cortical algorithms for perceptual grouping. Annual Review of Neuroscience, 29, 203–227.
Roelfsema P. R., Tolboom M., Khayat P. S. (2007). Different processing phases for features, figures, and selective attention in the primary visual cortex. Neuron, 56 (5), 785–792.
Saarela T. P., Herzog M. H. (2008). Time-course and surround modulation of contrast masking in human vision. Journal of Vision, 8 (3): 23, 1–10, doi:10.1167/8.3.23. [PubMed] [Article]
Sceniak M. P., Ringach D. L., Hawken M. J., Shapley R. (1999). Contrast's effect on spatial summation by macaque V1 neurons. Nature Neuroscience, 2 (8), 733–739, doi:10.1038/11197.
Shani R., Sagi D. (2005). Eccentricity effects on lateral interactions. Vision Research, 45 (15), 2009–2024.
Siderov J., Waugh S. J., Bedell H. E. (2013). Foveal contour interaction for low contrast acuity targets. Vision Research, 77, 10–13, doi:10.1016/j.visres.2012.11.008.
Simmers A. J., Gray L. S., McGraw P. V., Winn B. (1999). Contour interaction for high and low contrast optotypes in normal and amblyopic observers. Ophthalmic and Physiological Optics, 19 (3), 253–260.
Solomon J. A., Morgan M. J. (2000). Facilitation from collinear flanks is cancelled by non-collinear flanks. Vision Research, 40 (3), 279–286.
Stemmler M., Usher M., Niebur E. (1995). Lateral interactions in primary visual cortex: A model bridging physiology and psychophysics. Science, 269 (5232), 1877–1880.
Sterkin A., Polat U. (2008). Response similarity as a basis for perceptual binding. Journal of Vision, 8 (7): 17, 1–12, doi:10.1167/8.7.17. [PubMed] [Article]
Sterkin A., Yehezkel O., Bonneh Y. S., Norcia A., Polat U. (2009). Backward masking suppresses collinear facilitation in the visual cortex. Vision Research, 49 (14), 1784–1794, doi:10.1016/j.visres.2009.04.013.
Stettler D. D., Das A., Bennett J., Gilbert C. D. (2002). Lateral connectivity and contextual interactions in macaque primary visual cortex. Neuron, 36 (4), 739–750.
Strasburger H., Malania M. (2013). Source confusion is a major cause of crowding. Journal of Vision, 13 (1): 24, 1–20, doi:10.1167/13.1.24. [PubMed] [Article]
Tripathy S. P., Cavanagh P., Bedell H. E. (2014). Large crowding zones in peripheral vision for briefly presented stimuli. Journal of Vision, 14 (6): 11, 1–11, doi:10.1167/14.6.11. [PubMed] [Article]
Tyler C. W., Apkarian P., Levi D. M., Nakayama K. (1979). Rapid assessment of visual function: An electronic sweep technique for the pattern visual evoked potential. Investigative Ophthalmology & Visual Science, 18 (7), 703–713. [PubMed] [Article]
Watson A. B. (1982). Summation of grating patches indicates many types of detector at one retinal location. Vision Research, 22 (1), 17–25.
Watson A. B., Barlow H. B., Robson J. G. (1983). What does the eye see best? Nature, 302 (5907), 419–422.
Weliky M., Kandler K., Fitzpatrick D., Katz L. C. (1995). Patterns of excitation and inhibition evoked by horizontal connections in visual cortex share a common relationship to orientation columns. Neuron, 15 (3), 541–552.
Westheimer G., Hauske G. (1975). Temporal and spatial interference with vernier acuity. Vision Research, 15, 1137–1141.
Whitney D., Levi D. M. (2011). Visual crowding: A fundamental limit on conscious perception and object recognition. Trends in Cognitive Sciences, 15 (4), 160–168, doi:10.1016/j.tics.2011.02.005.
Williams C. B., Hess R. F. (1998). Relationship between facilitation at threshold and suprathreshold contour integration. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 15 (8), 2046–2051.
Woods R. L., Nugent A. K., Peli E. (2002). Lateral interactions: Size does matter. Vision Research, 42 (6), 733–745.
Yehezkel O., Sterkin A., Lev M., Polat U. (2015). Training on spatiotemporal masking improves crowded and uncrowded visual acuity. Journal of Vision, 15 (6): 12, 1–18, doi:10.1167/15.6.12. [PubMed] [Article]
Yeshurun Y., Rashal E. (2010). Precueing attention to the target location diminishes crowding and reduces the critical distance. Journal of Vision, 10 (10): 16, 1–12, doi:10.1167/10.10.16. [PubMed] [Article]
Yu C., Klein S. A., Levi D. M. (2003). Perceptual learning in contrast discrimination and the (minimal) role of context. Journal of Vision, 4 (3): 4, 169–182, doi:10.1167/4.3.4. [PubMed] [Article]
Yu C., Levi D. M. (2000). Surround modulation in human vision unmasked by masking experiments. Nature Neuroscience, 3 (7), 724–728.
Zenger B., Sagi D. (1996). Isolating excitatory and inhibitory nonlinear spatial interactions involved in contrast detection. Vision Research, 36 (16), 2497–2513.
Zenger-Landolt B., Koch C. (2001). Flanker effects in peripheral contrast discrimination—Psychophysics and modeling. Vision Research, 41 (27), 3663–3675.
Zomet A., Amiaz R., Grunhaus L., Polat U. (2008). Major depression affects perceptual filling-in. Biological Psychiatry, 64, 667–671.
Zomet A., Amiaz R., Polat U. (2007). Early perceptual loss in depression. Neural Plasticity, 1, 121.
Appendix
Background leading to the hypothesis and assumptions
Lateral masking
Masking experiments using Gabor patches reveal both suppression (masking) and facilitation. Collinear facilitation of a local target is found when the target is presented simultaneously with high-contrast collinear flankers (Adini & Sagi, 2001; Adini, Sagi, & Tsodyks, 1997; Bonneh & Sagi, 1998; Cass & Alais, 2006; Lev & Polat, 2011; Levi & Carney, 2011; Levi, Hariharan et al., 2002; Levi, Klein et al., 2002; Polat & Sagi, 1993, 1994a; Solomon & Morgan, 2000; Woods et al., 2002), and it depends on the target–flanker distance. For short distances of 0λ to 2λ, the target detection threshold is increased (masking, suppression; Polat & Sagi, 1993; Zenger & Sagi, 1996). For larger target–flanker distances, facilitation is usually maximal at 3λ, a distance at which the target and flankers do not overlap (Polat, 2009a; Polat & Sagi, 1993; Zenger & Sagi, 1996), and it decreases with increasing target–flanker distance (Chen & Tyler, 2008; Polat & Sagi, 1993, 1994a, 1994b). It has therefore been suggested that the human PF size in the fovea coincides with the suppression zone and is about 2λ to 3λ (Mizobe, Polat, Pettet, & Kasamatsu, 2001; Polat, 1999; Polat & Norcia, 1996; Polat & Sagi, 1994b; Polat & Tyler, 1999; Watson et al., 1983; Zenger & Sagi, 1996). Accordingly, masking effects from target–flanker separations of 2λ to 3λ or less may be considered as integration (or summation) within the same PF (pattern masking).
The above results were obtained from the fovea. Other studies indicated that lateral masking in the periphery differs from that in the fovea, suggesting that collinear facilitation is absent, rare, inconsistent, or even replaced by suppression (Giorgi et al., 2004; Levi & Carney, 2011; Levi, Hariharan, et al., 2002; Shani & Sagi, 2005; Williams & Hess, 1998; Zenger-Landolt & Koch, 2001). However, our recent study (Lev & Polat, 2011) showed that collinear facilitation and suppression behave similarly in the fovea and the periphery when the target–flanker separation is properly scaled (explained next).
Lateral interactions at the periphery
One of the main differences between masking and crowding comes from the observation that crowding is robust at the periphery, whereas masking is pronounced at the fovea (Levi, 2008; Levi & Carney, 2011; Whitney & Levi, 2011). In our model, this difference can be explained by an underestimation of the size of the underlying PF at the periphery. In our recent study (Lev & Polat, 2011) we found consistent facilitation at the periphery, showing that similar spatial rules (suppression and facilitation) apply for the fovea and the periphery when estimating the size of the human PF. The results indicate that the size of the PF is larger at the periphery, consistent with the principle of the cortical magnification factor (Daniel & Whitteridge, 1961; Dow, Snyder, Vautin, & Bauer, 1981; Duncan & Boynton, 2003; Levi, Klein, & Aitsebaomo, 1985). Thus, to obtain collinear facilitation at the periphery, the target–mask separation needs to be larger than at the fovea, reaching 5λ to 6λ at 4° of eccentricity. This finding may explain why previous studies using target–flanker separations not optimal for the periphery failed to find the expected facilitation effect (Levi, Klein et al., 2002; Shani & Sagi, 2005; Williams & Hess, 1998; Zenger-Landolt & Koch, 2001).
Configuration specificity of lateral interactions
Lateral interactions are configuration specific. For the collinear configuration, facilitation is found outside the PF and suppression is found inside the PF. In contrast, for the orthogonal configuration, no modulatory effect is found outside the PF, although the effect may become facilitatory inside the PF (Knierim & van Essen, 1992; Lev & Polat, 2011; Levitt & Lund, 1997; Polat et al., 1998). Therefore, many studies used the orthogonal configuration as a reference to indicate no modulatory effect from outside the RF when investigating lateral interactions (Lev & Polat, 2011; Maniglia et al., 2011, 2015; Shani & Sagi, 2005; Zomet, Amiaz, & Polat, 2007). Thus, two opposing types of modulation meet at the border of the PF: moving from outside to inside the PF, the collinear effect changes from facilitation to suppression, whereas the orthogonal effect changes from no effect to facilitation. Because these transitions run in opposite directions, both configurations show no effect at the border of the PF. Thus, a ratio of collinear/orthogonal = 1 can be used to accurately estimate the border of the PF.
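The ratio criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pf_border`, the linear interpolation between sampled separations, and the example values are hypothetical and are not taken from the study.

```python
def pf_border(separations, collinear, orthogonal):
    """Estimate the separation (in wavelength units) where the
    collinear/orthogonal ratio equals 1.

    Scans neighbouring separations in order and linearly interpolates
    between the first pair whose ratios straddle 1. Returns None when
    the ratio never reaches 1 within the sampled range.
    """
    ratios = [c / o for c, o in zip(collinear, orthogonal)]
    for i in range(len(ratios) - 1):
        r0, r1 = ratios[i], ratios[i + 1]
        if r0 == 1.0:
            return float(separations[i])
        if (r0 - 1.0) * (r1 - 1.0) < 0:  # ratios lie on opposite sides of 1
            s0, s1 = separations[i], separations[i + 1]
            return s0 + (1.0 - r0) * (s1 - s0) / (r1 - r0)
    if ratios[-1] == 1.0:
        return float(separations[-1])
    return None


# Hypothetical performance values for one observer (illustrative only).
separations = [1, 2, 3, 4, 6]             # target-flanker separation, in wavelengths
collinear = [0.8, 1.2, 2.2, 2.4, 2.0]     # performance measure, collinear flankers
orthogonal = [1.6, 1.6, 1.8, 2.0, 2.0]    # same measure, orthogonal flankers

border = pf_border(separations, collinear, orthogonal)
print(f"estimated PF border: {border:.2f} wavelengths")
```

Interpolating between sampled separations is a design choice made for this sketch; with coarse sampling, the estimate can be no more precise than the spacing of the tested separations allows.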
E, I, and E/I
Lateral interactions are both excitatory (E) and inhibitory (I) (Bair, Cavanaugh, & Movshon, 2003; Cass & Spehar, 2005; Chen, Kasamatsu, Polat, & Norcia, 2001; Fitzpatrick, 2000; Kasamatsu et al., 2010; Kisvarday, Toth, Rausch, & Eysel, 1997; Levitt & Lund, 1997; Mizobe et al., 2001; Polat & Sagi, 1993, 2006; Weliky, Kandler, Fitzpatrick, & Katz, 1995). Results suggest that the contextual effects are mediated by the long-range horizontal connections formed by pyramidal neurons within V1 (Bolz & Gilbert, 1989; Gilbert & Wiesel, 1989; Hirsch & Gilbert, 1991; Kisvarday et al., 1997; Mizobe et al., 2001; Weliky et al., 1995) but that feedback from higher cortical areas may also play a role (Gilbert & Li, 2013). The emerging results from these studies indicate that E is more selective: it operates between nonoverlapping neurons that are linked by long-range connections, share similar orientation preferences, and tend to connect preferentially along collinear configurations. The inhibitory effect results from local interactions and is less selective for the stimulus parameters. However, orthogonal flankers may produce different effects than collinear flankers produce at close proximity or from inside the receptive field, being facilitative or having no effect (Knierim & van Essen, 1992; Levitt & Lund, 1997; Polat et al., 1998). See Mizobe et al. (2001) and Polat (1999) for a schematic description of the model in the spatial domain.
In the temporal domain, it was shown that the time constant of the I is rapid and transient (Bair et al., 2003; Cass & Alais, 2006; Polat & Sagi, 2006). In contrast, the time constant of the E is relatively delayed and sustained (Bair et al., 2003; Cass & Spehar, 2005; Fitzpatrick, 2000; Grinvald, Lieke, Frostig, & Hildesheim, 1994; Kapadia et al., 1999; Polat & Sagi, 2006) and may abrogate the I with increasing presentation time (Polat & Sagi, 2006). Our model assumes that the masking effect results from I (local and lateral). See Polat and Sagi (2006) and Sterkin and Polat (2008) for a more detailed description of the dynamics of the E and I. 
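The qualitative claim that a fast, transient I dominates short presentations while a delayed, sustained E can outweigh it at longer ones can be illustrated with a toy calculation. The sketch below is not the model of Polat and Sagi (2006) or Sterkin and Polat (2008); the exponential forms, the function names, and every parameter value are arbitrary assumptions chosen only to show the idea.

```python
import math

# All parameter values are arbitrary, illustrative assumptions.
TAU_I = 20.0    # ms, decay constant of the fast, transient inhibition
E_DELAY = 40.0  # ms, onset delay of the lateral excitation
TAU_E = 50.0    # ms, rise constant of the sustained excitation
E_GAIN = 0.5    # plateau of excitation relative to peak inhibition


def inhibition(t):
    """Transient I: maximal at stimulus onset, then decaying exponentially."""
    return math.exp(-t / TAU_I)


def excitation(t):
    """Delayed, sustained E: zero until E_DELAY, then rising to a plateau."""
    if t < E_DELAY:
        return 0.0
    return E_GAIN * (1.0 - math.exp(-(t - E_DELAY) / TAU_E))


def net_drive(duration_ms, dt=1.0):
    """Integrate E - I over the presentation time (midpoint Riemann sum)."""
    steps = int(duration_ms / dt)
    return sum(
        (excitation((k + 0.5) * dt) - inhibition((k + 0.5) * dt)) * dt
        for k in range(steps)
    )


for duration in (30, 60, 120, 240):
    print(duration, round(net_drive(duration), 1))
```

With these assumed values the integrated drive is negative (net suppression) for the shortest duration and positive (net facilitation) for the longest; where the sign changes depends entirely on the chosen constants.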
Crowding is also affected by the presentation time (Chung & Mansfield, 2009; Lev, Yehezkel et al., 2014; Levi & Klein, 1985; Malania et al., 2007; Tripathy et al., 2014; Westheimer & Hauske, 1975). Thus, it may be affected by the dynamics of lateral interactions and is considered to reflect I (Levi, 2008; Polat & Sagi, 1993).
How contrast affects E/I
Spatial factors
The masking effect (suppression) increases with increasing target contrast (Polat, 1999; Polat & Sagi, 1994b, 2006; Zenger & Sagi, 1996), whereas facilitation is found for low-contrast targets. Crowding is also affected by contrast and by the relationship between the target and flanker contrasts (Coates, Chin, & Chung, 2013; Coates & Levi, 2014; Kooi, Toet, Tripathy, & Levi, 1994; Rashal & Yeshurun, 2014; Siderov, Waugh, & Bedell, 2013; Simmers, Gray, McGraw, & Winn, 1999), being stronger when both the target and the flankers have high contrast.
The effect of suppression is explained in psychophysical studies (Adini & Sagi, 2001; Adini et al., 1997; Chen & Tyler, 2002; Polat, 1999; Polat & Sagi, 1994b, 2006; Zenger & Sagi, 1996), physiological studies (Kasamatsu et al., 2010; Levitt & Lund, 2002; Mizobe et al., 2001; Polat et al., 1998), and modeling (Chen et al., 2001; Stemmler et al., 1995; Sterkin & Polat, 2008). It results from the activity of nonoptimal neurons (noise) in the target's vicinity, leading to I. Thus, an important factor that determines the shift from facilitation to suppression is the activity level in the network. Near the contrast threshold, only neurons that optimally respond to the stimulus features are driven by the stimulus (Pelli, 1985; Polat et al., 1998; Stemmler et al., 1995), thus minimizing the noise or the suppression and consequently maximizing the facilitation. 
The size of the receptive field is larger for low contrast and becomes smaller with increasing contrast, most likely due to increasing I from the vicinity of the receptive field (Cavanaugh et al., 2002a, 2002b; Fitzpatrick, 2000; Kapadia et al., 1999; Kasamatsu et al., 2010; Sceniak et al., 1999). On the other hand, for a collinear configuration, there is a level of a target's contrast where the E/I reaches a level that determines the optimal size of the receptive field (Kasamatsu et al., 2010). Thus, near the contrast threshold the E/I is relevant for specific spatiotemporal parameters and is indicative of the underlying PF that processes these visual stimuli. 
Temporal factors
The response latency increases with decreasing contrast (Albrecht & Hamilton, 1982; Kapadia et al., 1999; Polat et al., 2007; Sterkin & Polat, 2008). Thus, the strongest facilitation is revealed near the contrast threshold for an optimal combination of spatiotemporal parameters matching the target's response time and the slow propagation time of E (Polat & Sagi, 2006; Polat et al., 2007). Otherwise, there might be either no modulatory effect or even suppression for shorter presentation times.
Spatiotemporal domains and PF
Lateral masking and crowding are critically dependent on the target–flanker separation. Masking (suppression) and crowding are found within a certain range of target–flanker separations, known as the critical distance, which increases with increasing eccentricity. In the fovea the masking effect is found at distances less than 3λ (Polat, 1999; Polat & Sagi, 1993, 2006; Woods et al., 2002), whereas crowding is found at very small separations of less than 5 arcmin (Danilova & Bondarko, 2007; Flom et al., 1963; Pelli et al., 2004; Whitney & Levi, 2011). A larger critical distance is found at the periphery for masking (Lev & Polat, 2011) and crowding (Levi & Carney, 2011; Pelli & Tillman, 2008; Whitney & Levi, 2011). Facilitation (in collinear configurations) is found beyond the critical distance for masking (Polat, 1999; Polat & Sagi, 1993, 2006; Woods et al., 2002; Zenger & Sagi, 1996), whereas it is rare for crowding (Chung et al., 2001).
Inside the PF, the I consists of input from the flankers combined with the local processing of the target. Therefore, the lateral E and the local I interact within the time window that the target is processed, resulting in a complex outcome that depends on several spatiotemporal parameters such as the presentation time, the target–flanker separation, and contrast. However, for target–flanker separations larger than 2λ, because the propagation time of the lateral E is relatively slow (arriving after the peak of the local I near the contrast detection threshold), the suppressive effect is transformed to facilitation. 
Figure 1
 
Example of stimuli used in our experiments. (a) Collinear and orthogonal. (b) Lateral masking with different target–mask separations. The lateral masking consisted of a target in the presence of two collinear flankers. At the top, each separation is indicated by λ (wavelength) units. (c) Single and crowded letters used to measure the crowding effect at the fovea. (d) Crowded letters with larger interletter spacing used to measure crowding in the periphery. (e) Temporal crowding with letters used in the periphery.
Figure 2
 
Estimate of the PF. (a) Average d′ (y axis) against target–flanker separation (λ units, x axis) of the Gabor patches. The red line and closed triangles denote the orthogonal configuration, and the blue line with filled diamonds denotes the collinear configuration (see Figure 1a). (b) The same as for panel a but for phit. (c) The d′ ratio (collinear/orthogonal) y axis as a function of target–flanker separation (λ units, x axis). Each filled blue circle denotes the ratio for one subject. The solid line denotes the average of the data points. (d) The same as for panel c but using phit. The error bars denote the standard error of the mean (n = 20).
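For readers unfamiliar with the two measures named in this caption, the sketch below uses the standard signal detection definition, d′ = z(hit rate) − z(false-alarm rate). Whether the study computed d′ in exactly this way is an assumption, and the function name `d_prime` and the example rates are hypothetical.

```python
from statistics import NormalDist


def d_prime(p_hit, p_false_alarm):
    """Standard signal detection sensitivity: z(hit) - z(false alarm).

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the end points.
    """
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_false_alarm)


# Hypothetical rates (illustrative only): equal hit rates, different false alarms.
print(round(d_prime(0.84, 0.16), 2))
print(round(d_prime(0.84, 0.40), 2))
```

The example shows why the hit rate alone and d′ can differ: the same hit rate yields a lower d′ when the false-alarm rate is higher.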
Figure 3
 
Correlation between the estimates of the PF. The y axis denotes the estimates using phit, and the x axis denotes the estimates using d′. Each filled circle denotes one subject. The solid line is the linear fit for the data (n = 20).
Figure 4
 
The crowding effect using letter identification. The crowding effect (crowded minus a single letter) in logMAR units (y axis) against interletter spacings of two and four. The crowding effect is much larger for the interletter spacing of two. The error bars denote the standard error of the mean (n = 20).
Figure 5
 
The correlation between crowding and the PF. The crowding condition y axis (panels a and b; degrees, in letter spacing, center to center) and the crowding effect (panels c and d; crowded minus single letter) against the estimated size of the PF (x axis). The top axis is in degrees, and the bottom axis is in λ units. Each data point denotes the estimated size of the PF for each subject. The solid line is the linear regression line; the correlation is indicated by r in each panel. Panels a and c used d′, whereas panels b and d used the phit measurement (n = 20).
Figure 6
 
Temporal crowding. (a) The temporal crowding effect (y axis) in logMAR units for two and four interletter spacings at 4° of eccentricity. The correlation between the effect of temporal masking in visual angles (y axis) against the estimated size of the PF (x axis, top axis in degrees, bottom axis in λ units). Each data point is the estimated size for each subject. (b) Using d′ as a measurement. (c) Using phit as a measurement. The error bars denote the standard error of the mean (n = 16).
Figure 7
 
Letter identification (crowding effect) in the fovea as a function of presentation time. (a) Percentage correct (y axis) against the presentation time in milliseconds (x axis). The blue line and filled circles denote the target alone, and the red line and red filled circles denote the crowded conditions. The effect of crowding is significant for all presentation times but is maximal for the shorter ones. (b) The crowding effect (reduction in the percentage correct, y axis) as a function of presentation time (milliseconds, x axis). The solid line denotes the linear fit of the data points (r = 0.93, r² = 0.864), showing that crowding is apparent at more than 240 ms. The error bars denote the standard error of the mean (n = 13).
Figure 8
 
Suppression zone in the fovea for a short duration time of 30 ms. (a) Masking thresholds in log units (y axis), as measured for collinear configurations (see Figure 1) as a function of the target–flanker separation in λ units (x axis). The solid red line indicates the threshold of the target alone. The data points and the error bars are taken from single measures and are presented at each location for convenience and for comparison with the lateral interaction data. There is a large masking effect for short distances and no facilitation effect at 3λ and 4λ. The error bars denote the standard error of the mean (n = 13).
Figure 9
 
Correlation between letter crowding and Gabor masking at different target–flanker separations. Crowding effect (reduction in percentage correct, y axis) for 30 (top), 60 (middle), and 120 (bottom) ms for a target–flanker separation of 2λ (left), 3λ (middle), and 4λ (right). The x axis denotes the masking threshold as a function of target–flanker distance. Each data point denotes data for one subject, and the solid line denotes the correlation fit of the dots. The correlation increases with increasing presentation time, but it is always highest for 3λ (n = 13).
Figure 10
 
The correlation between the critical time to escape from crowding and masking. For each subject the critical duration in milliseconds to escape from crowding (no crowding, y axis) is presented against masking thresholds (log units) for different target–flanker distances: (a) 2λ, (b) 3λ, and (c) 4λ. Each data point denotes data for one subject. A very high correlation was found for 3λ (n = 13).
Table 1.
 
Statistics for the correlation between letter crowding and Gabor masking (Figure 9).