Article  |   May 2013
Spatial structure of contextual modulation
Author Affiliations
  • I. Mareschal
    School of Psychology & Australian Centre of Excellence in Vision Science, The University of Sydney, Sydney, New South Wales, Australia
    School of Biological and Chemical Sciences, Psychology, Queen Mary University of London, London, UK
    mareschal@gmail.com
  • C. W. G. Clifford
    School of Psychology & Australian Centre of Excellence in Vision Science, The University of Sydney, Sydney, New South Wales, Australia
    colin.clifford@sydney.edu.au
Journal of Vision May 2013, Vol.13, 2. doi:https://doi.org/10.1167/13.6.2
      I. Mareschal, C. W. G. Clifford; Spatial structure of contextual modulation. Journal of Vision 2013;13(6):2. https://doi.org/10.1167/13.6.2.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Contextual effects are ubiquitous in vision and provide a means for detectors with localized receptive fields to encode global properties of a stimulus. Although the nature of the neural connections is complex, the majority of evidence supports the Gestalt idea of collinearity; interactions are greatest when the target and surround orientations are spatially aligned to form a contour. Here we create a novel stimulus that simultaneously probes all areas around a detector to determine which spatial positions influence perception in human observers. We find that the surrounding spatial areas that contribute most to contextual effects for our perception of orientation and motion are not confined to a specific location. Rather, our results reveal that human perception displays some interobserver variability in the weighting of detector interactions that is largely independent of collinear structure. We propose that these more extensive surround stimuli reveal how complex visual structure may modulate performance in a manner that is not easily predictable using more conventional stimuli.

Introduction
The world around us appears coherent even though detectors in our visual system only convey information about small localized areas of space known as their receptive fields (Enroth-Cugell & Robson, 1966; Hubel & Wiesel, 1968). Despite this piecemeal representation of the visual scene, the human brain reconstructs a faithful and coherent representation of the complex visual images it receives. A clue to understanding how this is achieved is the discovery of contextual effects whereby neighboring detectors interact to signal the global properties of a stimulus (Gilbert & Wiesel, 1990; Levitt & Lund, 1997; Nelson & Frost, 1978; Nothdurft, Gallant, & van Essen, 1999; Sillito, Grieve, Jones, Cudeiro, & Davis, 1995; Walker, Ohzawa, & Freeman, 1999). In human psychophysics the majority of evidence supports the importance of collinearity; interactions are greatest when the target and surround orientations are spatially aligned, evidenced in a variety of phenomena such as contour integration (Field, Hayes, & Hess, 1993; Kovacs & Julesz, 1993), crowding (Mareschal, Morgan, & Solomon, 2010; Toet & Levi, 1992), contrast facilitation (Polat & Sagi, 1993), and orientation illusions (Kapadia, Westheimer, & Gilbert, 2000; Schwartz, Sejnowski, & Dayan, 2006). Indeed, in their pivotal contour paper, Field et al. (1993) proposed the concept of a local association field, whereby detectors tuned to similar orientations and aligned “end to end” (rather than side to side) had enhanced interactions to signal continuity. In their model, integration can occur between filters that are not perfectly aligned, as long as their relative orientations do not exceed 50°–60°. Recently, there has been a burgeoning interest in linking the tuning properties of detectors to the statistics of natural images (Felsen & Dan, 2005; Rust & Movshon, 2005). 
Given that natural images are rich in highly structured statistical properties that define edges, the selectivity of contextual effects for collinearity might provide a neural substrate optimal to process natural scenes (Schwartz et al., 2006; Sigman, Cecchi, Gilbert, & Magnasco, 2001). There are, however, two caveats to the above findings. The first is that most of the stimuli used to test this idea have strong second-order orientation cues that bias results (Morgan & Baldassi, 1997). The second is that the presence of additional surround stimuli along the sides of a target can modulate the strength of collinear alignment in both contrast facilitation (Solomon & Morgan, 2000) and contour detection (Dakin & Baruch, 2009), suggesting that conventional stimulus configurations do not capture the complex layout of target–surround interactions. 
Physiological experiments have revealed that both classical and non-classical receptive fields can have nonhomogeneous layouts (e.g., Nishimoto, Ishida, & Ohzawa, 2006; Walker et al., 1999). Using a noise stimulus larger than the neuron's receptive field, Nishimoto et al. (2006) measured properties of subfields within the receptive field. Here we were motivated by a similar idea to examine contextual interactions underlying human perception. Specifically, we assessed which surround areas (spatial hotspots) influence our perception of a target when the entire surround is stimulated. 
Materials and methods
Apparatus and observers
A Dell Optiplex computer running Matlab (MathWorks Ltd.) and incorporating elements of the PsychToolbox (Brainard, 1997) was used for stimulus generation and recording subjects' responses. Stimuli were displayed on a calibrated Diamond Digital monitor (1024 × 768 pixels; 85 Hz) driven by the computer's Radeon graphics card and viewed binocularly. At the viewing distance of 57 cm, 1 pixel subtended 2.1 arcmin. The authors and six naïve observers served as subjects. All experiments adhered to the Declaration of Helsinki guidelines. 
Stimulus and procedure
We created a novel surround that consisted of irregularly shaped abutting windows that contained one of two opposite stimuli such that globally the stimulus averaged to an uninformative mean for the task but that, locally, sections were dominated by one of the two stimuli. Borders between the windows were used to create a new mask on each trial and were defined by the zero crossings of low-pass filtered white Gaussian noise (Figure 1b). Their shapes provided no cue to orientation thus bypassing problems associated with second-order information. We hypothesized that if certain areas of the surround (spatial hotspots) exert greater influence on the target, then the observer's perception of the target will be determined by how the local stimulus statistics map onto these areas (Figure 1c). 
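The mask construction described above can be sketched as follows. This is a minimal illustration and not the authors' Matlab code: it assumes an ideal (brick-wall) low-pass filter, which the text does not specify, and builds the filtered noise directly as a sum of random low-frequency sinusoids, which is equivalent to keeping only the Fourier components of white Gaussian noise below the cutoff. The function name `lowpass_noise_mask` and the image size are hypothetical.

```python
import math
import random

def lowpass_noise_mask(size=32, cutoff=4, rng=None):
    """Bipolar mask (+1 / -1) from the sign of band-limited Gaussian noise.

    Assumption: ideal low-pass filter keeping every spatial frequency up to
    `cutoff` cycles/image; the DC term is excluded so the noise has zero mean.
    """
    rng = rng or random.Random()
    # One Gaussian cosine and sine amplitude per frequency in a half-plane
    # (the other half-plane is redundant for a real-valued image).
    comps = [(fx, fy, rng.gauss(0, 1), rng.gauss(0, 1))
             for fx in range(-cutoff, cutoff + 1)
             for fy in range(0, cutoff + 1)
             if (fy > 0 or fx > 0) and math.hypot(fx, fy) <= cutoff]

    def noise(x, y):
        total = 0.0
        for fx, fy, a, b in comps:
            phase = 2 * math.pi * (fx * x + fy * y) / size
            total += a * math.cos(phase) + b * math.sin(phase)
        return total

    # Window borders are the zero crossings: keep only the sign of the noise.
    return [[1 if noise(x, y) >= 0 else -1 for x in range(size)]
            for y in range(size)]

# A new mask would be drawn on every trial; a fixed seed is used here only
# to make the illustration repeatable.
mask = lowpass_noise_mask(size=16, cutoff=3, rng=random.Random(0))
```

In the experiment, each +1 region would show one of the two surround stimuli and each −1 region the other; lowering `cutoff` gives larger, coarser windows.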
Figure 1
 
Sample stimuli and the two types of analysis performed. (a) Examples of vertical and horizontal tilt illusion stimuli. (b) A new bipolar mask defined by the zero crossings in low-pass filtered noise is generated on each trial. (c) The black and white sections of the mask contain oppositely oriented gratings and the target (horizontal or vertical) grating is located in the center of the stimulus. (d) Masks that led to the observer responding “CCW” are summed and subtracted from the sum of masks that led the observer to respond “CW,” resulting in a hotspots map where pixel intensity is proportional to strength of influence. (e) Example of the conventional White's illusion and (f) the masked version of White's illusion. (g) Masks on the left side that led to the observer reporting the left as appearing lighter were summed together and polarity-inverted masks on the right side that were reported to appear lighter were summed together. The compiled “left lighter” and “right lighter” images were added into a composite hotspots map where pixel intensity is proportional to influence.
Orientation
The stimuli consisted of a concentric annular surround (outer diameter 8.9°) separated by a 2-pixel-wide gap from the circular target (diameter 2.2°), which contained a 1.84 c/° grating that, in separate runs, was either horizontal or vertical. The surround consisted of irregularly shaped abutting windows whose shapes were defined by a bipolar mask (Figure 1b). Borders between the windows were defined by the zero crossings of low-pass filtered white Gaussian noise that was recreated on each trial using one of two filter cutoffs, 8 c/image or 6 c/image; observer IM was also tested with two additional filter cutoffs, 10 c/image and 4 c/image. The polarity value of a window in the mask determined the orientation of the 1.84 c/° grating contained within it; in white sections (pixel value = +1) the grating was tilted +15° from the target, and in black sections (pixel value = −1) the grating was tilted −15° from the target (Figure 1c). Both orientations covered the same total area of the surround. All gratings were 30% contrast and their spatial phases were independently varied on each trial. Observers fixated a central square (1 s) that was extinguished during stimulus presentation (150 ms) then reported whether the target appeared tilted clockwise (CW) or counterclockwise (CCW) of vertical or horizontal using the keys “a” and “l,” respectively. Observers completed a minimum of 2000 trials per condition in blocks of 100 trials, collected over several weeks. A control condition was performed using surround orientations (+60° and −60°) that do not elicit a tilt illusion (e.g., Westheimer, 1990). 
Lightness
The central target was an elongated vertical bar set to the mean grey (0.84° × 0.28°) within a square stimulus that subtended 4.4°. The surround consisted of windows (as above, with filter cutoff of 12 c/image) onto one of two polarity-reversed, vertical, square-wave gratings whose spatial frequency matched the width of the target bar (1.78 c/°). The stimulus was presented in two spatial locations, to the right of center and in opposite polarity to the left of center (inner edges separated by 4.2°). The observer was instructed to scan (free view) the stimulus for 1 s and then reported with a key press which side appeared lighter, completing 500 trials in blocks of 100 trials. 
Hotspots maps were constructed from the bipolar masks in the following manner: Masks on trials where the observer judged the left target as lighter were summed together and masks on trials where the observer judged the right target as lighter were sign inverted and then added to the summed left lighter masks. 
Motion
The central target (diameter 2.8°) contained isotropic fractal noise scaled to cover the full range of lightness values that drifted upwards (4.0°/sec). The surround (outer diameter 15.4°) contained windows (filter cutoff 16 c/image) onto different sets of isotropic fractal noise that drifted ± 36° from the direction of the target (3.6°/sec). Observers fixated a central square (1 s) that was extinguished during stimulus presentation (300 ms) then reported whether the direction of the center was CW or CCW of upwards motion, completing 2000 trials in blocks of 100. 
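The filter cutoffs are given in cycles per image, while the figure captions also quote them in cycles per degree. The short calculation below shows the conversion under the assumption that one "image" spans the full stimulus extent (surround outer diameter, or width of the square lightness stimulus). The text does not state this explicitly, but the assumption reproduces the values quoted in the captions. The helper name `cycles_per_degree` is hypothetical.

```python
def cycles_per_degree(cutoff_c_per_image, extent_deg):
    """Convert a filter cutoff from cycles/image to cycles/degree,
    assuming the image spans `extent_deg` degrees of visual angle."""
    return cutoff_c_per_image / extent_deg

# (experiment, cutoff in c/image, assumed stimulus extent in degrees)
conditions = [
    ("orientation", 6, 8.9),
    ("orientation", 8, 8.9),
    ("lightness", 12, 4.4),
    ("motion", 16, 15.4),
]

converted = [(name, cutoff, round(cycles_per_degree(cutoff, extent), 2))
             for name, cutoff, extent in conditions]
for name, cutoff, cpd in converted:
    print(f"{name}: {cutoff} c/image = {cpd:.2f} c/deg")
```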
White's predictions
The spatial resolution of our technique is inevitably limited by the granularity of the mask. Using a square target rather than an elongated bar fails to elicit a compelling illusion, suggesting that the illusion may arise in large part due to the elongated nature of the stimulus. In order to determine whether this was the case, we created a series of predictive hotspots maps that we cross correlated with each observer's data. To generate the predicted data for the White's illusion experiment, one thousand random bipolar (black/white) masks were generated. Those in which the central pixel was black were negated and then summed with those in which the central pixel was already white to produce a summed image of one thousand masks each centered on a white pixel. The resulting image was taken as an estimate of the effective point spread function of our classification images arising from spatial correlations within the mask. 
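The point spread function estimate described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the mask generator `bipolar_mask` assumes an ideal low-pass filter, and the small image size, cutoff, and number of masks in the example call are chosen only to keep the sketch fast. Multiplying each mask by the sign of its central pixel is the same as negating the masks whose central pixel is black.

```python
import math
import random

def bipolar_mask(size, cutoff, rng):
    """Sign of band-limited Gaussian noise (assumed ideal low-pass filter)."""
    comps = [(fx, fy, rng.gauss(0, 1), rng.gauss(0, 1))
             for fx in range(-cutoff, cutoff + 1)
             for fy in range(0, cutoff + 1)
             if (fy > 0 or fx > 0) and math.hypot(fx, fy) <= cutoff]

    def noise(x, y):
        return sum(a * math.cos(2 * math.pi * (fx * x + fy * y) / size)
                   + b * math.sin(2 * math.pi * (fx * x + fy * y) / size)
                   for fx, fy, a, b in comps)

    return [[1 if noise(x, y) >= 0 else -1 for x in range(size)]
            for y in range(size)]

def estimate_psf(n_masks=1000, size=15, cutoff=2, seed=0):
    """Sum of random masks, each sign-flipped so its central pixel is white."""
    rng = random.Random(seed)
    c = size // 2  # index of the central pixel (size should be odd)
    psf = [[0] * size for _ in range(size)]
    for _ in range(n_masks):
        m = bipolar_mask(size, cutoff, rng)
        sign = m[c][c]  # +1 if centre is white, -1 if black (mask negated)
        for y in range(size):
            for x in range(size):
                psf[y][x] += sign * m[y][x]
    return psf

# Small example run (the text uses one thousand masks).
psf = estimate_psf(n_masks=100, size=11, cutoff=2, seed=1)
```

By construction the central value equals the number of masks, and values fall off with distance from the centre at a rate set by the spatial correlations in the mask.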
Results
Figure 2 shows the surround areas of influence (hotspots maps) in the tilt illusion stimulus when the observers' task was to report whether the central target appeared tilted CW or CCW of vertical (Figure 2a) or horizontal (Figure 2b). Masks used to create the tilt illusion stimulus were stored on each trial into the CW or CCW category, based on the observer's response to the target. At the end of all trials the masks in the CCW category were summed together and subtracted from the sum of the masks in the CW category to build the hotspots maps. 
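The construction of a hotspots map from stored masks and responses can be sketched as below. The function follows the description above (sum of CW masks minus sum of CCW masks). The simulated observer is purely hypothetical and is included only to show that a spatially localized influence emerges in the map: the masks are independent random pixels rather than low-pass noise, and the hotspot location and internal noise level are arbitrary choices.

```python
import random

def hotspots_map(masks, responses):
    """Sum of masks on 'CW' trials minus sum of masks on 'CCW' trials."""
    height, width = len(masks[0]), len(masks[0][0])
    out = [[0] * width for _ in range(height)]
    for mask, resp in zip(masks, responses):
        sign = 1 if resp == "CW" else -1
        for y in range(height):
            for x in range(width):
                out[y][x] += sign * mask[y][x]
    return out

# Toy simulation: all values below are hypothetical.
rng = random.Random(1)
SIZE, N_TRIALS = 8, 2000
HOTSPOT = [(0, 3), (0, 4)]  # (row, column) pixels that drive the response

masks, responses = [], []
for _ in range(N_TRIALS):
    m = [[rng.choice((-1, 1)) for _ in range(SIZE)] for _ in range(SIZE)]
    # The simulated response depends only on the hotspot pixels plus noise.
    drive = sum(m[y][x] for y, x in HOTSPOT) + rng.gauss(0, 0.5)
    masks.append(m)
    responses.append("CW" if drive > 0 else "CCW")

hot = hotspots_map(masks, responses)
```

Pixels outside the hypothetical hotspot accumulate values near zero because their polarity is uncorrelated with the response, whereas the two hotspot pixels accumulate large values.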
Figure 2
 
Orientation hotspots maps obtained for two different mask sizes. Sample stimuli are shown on the left for a horizontal target (top) and a vertical target (bottom). Observers' hotspots maps are in the middle panels; the top two rows were obtained using a mask generated with a filter cutoff of 6 c/image (0.67 c/°), the bottom three rows were obtained with a mask generated with a filter cutoff of 8 c/image (0.90 c/°). Three conditions were tested: (a) a vertical condition, (b) a horizontal condition, (c) the difference map between vertical and horizontal conditions, and (d) a control condition where the surround gratings were ± 60° from the target orientation and produced no tilt illusion (horizontal target for IM and RM, vertical target for EG and GV). Significant pixels are shown in the right-hand panels of the figure for the corresponding conditions. The bottom row shows the summed hotspots maps (observers IM, RM, EG) for the vertical condition (left panels) and the horizontal condition (right panels). Summed pixels maps are the sum of pixel values within one wedge of a circular-edged bow tie (bandwidth 30°) using two different circle radii to create the bow tie (color of data points indicates the radius used as shown on the hotspots maps).
For three of the four observers (GV excepted) the hotspots maps for the vertical and horizontal targets appear similar across the two conditions (Figures 2a & b), highlighting a lack of dependency on the stimulus configuration. This suggests no greater weighting of interactions at the ends of the stimulus, contrary to what would be predicted from Kapadia et al. (2000) who report that collinear interactions (flankers aligned end to end) produced greater repulsion than side to side (which sometimes produced weak attraction) using a three bar stimulus. This is readily seen in the vertical/horizontal difference maps (Row C) that contain no structure. Observer GV displayed a difference between the vertical and horizontal target but in a direction opposite to the collinearity prediction. The control experiment (Row D) where observers did not experience a tilt illusion failed to produce any structure in the hotspots maps. 
To aid visualization of the hotspots maps, we also show significant pixels maps (not corrected for multiple comparisons), calculated relative to the control condition. For each observer, we calculated the mean and standard deviation of the hotspots maps for their control condition; pixel values in the horizontal, vertical, control, and horizontal–vertical conditions that differed significantly from this distribution (p = 0.001) were set to white. 
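One way to compute such a significant pixels map is sketched below. The text gives the criterion (p = 0.001) but not the exact test, so this sketch assumes a two-tailed criterion under a normal distribution whose mean and standard deviation are taken from the control map; the function name `significant_pixels` is hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def significant_pixels(test_map, control_map, p=0.001):
    """Binary map: 1 (white) where a pixel differs from the control
    distribution at the two-tailed level p, else 0 (black)."""
    control_values = [v for row in control_map for v in row]
    mu = mean(control_values)
    sd = pstdev(control_values)  # population SD of the control map (assumed)
    z_crit = NormalDist().inv_cdf(1 - p / 2)  # about 3.29 for p = 0.001
    return [[1 if abs(v - mu) > z_crit * sd else 0 for v in row]
            for row in test_map]

# Tiny worked example with made-up numbers: control mean 0, SD 1.
control = [[-1, 1], [1, -1]]
test = [[0, 4], [-3.5, 3]]
sig = significant_pixels(test, control)
```

Because no correction for multiple comparisons is applied, a map with many pixels is expected to contain a few white pixels by chance.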
Hotspots maps summed across three observers (bottom row) show uniform structure around the target. In order to examine whether the structure around the target was isotropic, we summed pixel values within small subregions of the hotspots maps. If the maps are isotropic, the summed pixel values should be similar across wedge positions. Two different wedge radii are used, overlaid on the summed hotspots maps using a 30° bandwidth wedge. For the two sizes tested there is no difference in summed pixel values within the subregions for the summed horizontal or vertical maps; the plots are circular (circular variances ranged from 0.84 to 0.92). 
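The wedge analysis can be sketched as follows. The exact formula used for the circular variance is not given in the text, so this sketch assumes the standard definition, one minus the length of the mean resultant vector, with each wedge's summed pixel value acting as the weight at the wedge's centre angle (non-negative sums are assumed). Function names and the example radii are hypothetical.

```python
import cmath
import math

def wedge_sums(hmap, r_in, r_out, bandwidth_deg=30):
    """Sum pixel values in angular wedges of an annulus centred on the map."""
    height, width = len(hmap), len(hmap[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    n = int(360 // bandwidth_deg)
    sums = [0.0] * n
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, cy - y  # image rows increase downward
            if r_in <= math.hypot(dx, dy) < r_out:
                angle = math.degrees(math.atan2(dy, dx)) % 360
                sums[int(angle // bandwidth_deg) % n] += hmap[y][x]
    return sums

def circular_variance(sums, bandwidth_deg=30):
    """1 - |mean resultant|; close to 1 when the sums are spread evenly."""
    resultant = sum(s * cmath.exp(1j * math.radians((k + 0.5) * bandwidth_deg))
                    for k, s in enumerate(sums))
    return 1 - abs(resultant) / sum(sums)

# An isotropic (uniform) map gives a circular variance close to 1.
uniform = [[1.0] * 21 for _ in range(21)]
cv_uniform = circular_variance(wedge_sums(uniform, r_in=3, r_out=9))
```

Under this definition a map whose influence is concentrated in a single wedge gives a value near 0, so values approaching 1 indicate little directional bias.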
The finding that the hotspots maps reveal some differences between observers is consistent with earlier reports of interobserver variability in classification images (Gold, Murray, Bennett, & Sekuler, 2000; Rajashekar, Bovik, & Cormack, 2006). To determine how much variation there was between observers we performed the following analysis: We randomly selected half of an observer's data (10 runs) to build a subset map and used the other half (10 runs) to build a second subset map. We then cross correlated the first subset map with the second one for the same observer and with the subset maps of different observers. When subset maps were tested within the same observer, the average correlation (across conditions and observers) was 0.36 ± 0.12; when they were tested across observers, the average correlation was reduced to 0.24 ± 0.08. It appears that the observer's decision process throughout the trials was fairly stable and that there is some overlap between different observers' hotspots maps. This reflects the fact that all observers showed maximum influence for areas immediately adjacent to the target. 
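The split-half analysis can be sketched as below. The text does not define "cross correlated" precisely, so this sketch assumes a Pearson correlation between pixel values of the two subset maps. It also splits individual trials at random, whereas the analysis above split runs. Masks are stored as flat lists, responses are coded +1 (CW) and −1 (CCW), and the simulated observer is hypothetical.

```python
import math
import random

def subset_map(masks, responses, indices):
    """Response-weighted sum of masks (CW = +1, CCW = -1) over some trials."""
    out = [0] * len(masks[0])
    for i in indices:
        for p, value in enumerate(masks[i]):
            out[p] += responses[i] * value
    return out

def pearson(a, b):
    """Pearson correlation between two equal-length lists of pixel values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def split_half_correlation(masks, responses, rng):
    """Correlate maps built from two random, non-overlapping halves."""
    idx = list(range(len(masks)))
    rng.shuffle(idx)
    half = len(idx) // 2
    return pearson(subset_map(masks, responses, idx[:half]),
                   subset_map(masks, responses, idx[half:]))

# Hypothetical observer whose responses are driven by pixels 3 and 4 only.
rng = random.Random(2)
masks = [[rng.choice((-1, 1)) for _ in range(64)] for _ in range(2000)]
responses = [1 if m[3] + m[4] + rng.gauss(0, 0.5) > 0 else -1 for m in masks]
r_within = split_half_correlation(masks, responses, rng)
```

The same `pearson` function applied to subset maps from two different observers would give the across-observer comparison.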
The lack of structure in the control condition hotspots maps when observers never experienced a tilt illusion (Row D) indicates that they are not simply reporting the orientation of parts of the surround during the orientation task. When mask size was varied, the larger mask led to a coarser sampling of the surround space, but the hotspots maps were defined by structure in similar sections (IM, Rows 1 & 4). To examine in detail how the ratio of mask to target sizes may alter the hotspots maps, observer IM performed the experiment for horizontal and vertical targets using two additional target and mask sizes (Figure 3). 
Figure 3
 
Orientation hotspots maps for observer IM at four different mask sizes (ordinate) and three different target sizes (abscissa) in the vertical and horizontal conditions. Significant pixels are shown next to the maps.
This observer displays similar structure in the hotspots maps located mainly along the upper section of the surround across a range of target/mask sizes and orientations. When the masks are very large, the coarse sampling of the surround limits the effectiveness of the technique. Similarly, for very small mask sizes, reducing the windows on the surround will start to introduce noise at a fine spatial scale. The significant pixels maps here were produced by calculating the mean and standard deviation of the distribution of pixel values for each hotspots map. Pixels more than 2 standard deviations from the mean were set to white, while those that did not exceed this criterion were set to black. 
Figure 4A (left panels) shows the hotspots maps using White's illusion, a phenomenon where identical grey patches placed on the black and white bars of a square wave grating appear to differ in lightness (White, 1979; see Figure 1e). Masks that led the observer to respond “left lighter” are summed and added to the polarity inverted masks that led the observer to respond “right lighter.” The corresponding significant pixels for the summed masks are shown on the right. White's illusion is reported to be due to assimilation from the flanks or contrast from the ends. For all observers, the hotspots maps reveal structure localized along the flanks of the stimulus that could arise due to the elongated nature of the stimulus. In order to examine this we created a series of combinatorial predictive hotspots maps that reflected a weighted sum of side flank assimilation and end flank contrast. This was done by making two predictive maps, one for side flank assimilation only and one for end contrast only, by convolving the estimated point spread function of the filter used to create the masks (Figure 4B[b]) with the two alternate configurations: one for assimilation (Figure 4B[a]) and one for contrast (Figure 4B[c]). The two maps were then summed in different ratios to create the series of combinatorial predictive maps. Cross correlating these predictive maps with the observers' hotspots maps (Figure 4B[g]) reveals that the addition of contrast influences does not increase the correlation obtained using only assimilation for any of the four observers. This results from the nature of the stimulus: the predictive hotspots maps driven solely by assimilation (Figure 4B[d]) and by equal contributions of assimilation and contrast (Figure 4B[e]) are very similar. 
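The search over mixing ratios can be sketched as follows. The sketch assumes a Pearson correlation between pixel values and a linear mixture with weight w on the assimilation map and 1 − w on the contrast map (so that 1 is pure assimilation and 0 is pure contrast); the maps in the example are made-up flat lists, not the predictive maps of Figure 4, and the function names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length lists of pixel values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def best_mixture(observed, assimilation, contrast, steps=11):
    """Correlate the observed map with w*assimilation + (1-w)*contrast
    for evenly spaced w in [0, 1]; return (best_w, best_r, all_results)."""
    results = []
    for i in range(steps):
        w = i / (steps - 1)  # 1 = pure assimilation, 0 = pure contrast
        predicted = [w * a + (1 - w) * c
                     for a, c in zip(assimilation, contrast)]
        results.append((w, pearson(observed, predicted)))
    best_w, best_r = max(results, key=lambda item: item[1])
    return best_w, best_r, results

# Made-up example: the "observed" map equals the assimilation prediction.
assimilation = [1, 0, 1, 0, 0, 0]
contrast = [0, 1, 0, 0, 0, 1]
best_w, best_r, curve = best_mixture(assimilation, assimilation, contrast)
```

Plotting the correlation in `curve` against w gives the kind of curve summarized in Figure 4B[g]; a maximum at w = 1 corresponds to the case where adding contrast does not improve on assimilation alone.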
Figure 4
 
Lightness (White's illusion) hotspots maps and predictions. (A) White's illusion hotspots maps obtained by summing the masks (left) and significant pixels (p = 0.0001; right) in four observers, obtained using a vertical bar target with a filter cutoff of 12 c/image (2.73 c/°) to create windows onto the background. (B) Predictive hotspots maps were built by convolving the estimated point spread function of the filter used to create the mask with one of two configurations: assimilation only along the sides (a) and contrast only along the ends (c), resulting in predictive hotspots maps for assimilation only (d) and contrast only (f). The relative contributions of the assimilation and contrast maps were varied to construct a set of combinatorial predictive maps (equal contribution shown in [e]). (C) Combinatorial predictive maps were correlated with each observer's hotspots maps data to determine which ratio of assimilation to contrast produced the greatest correlation; zero is pure contrast and one is pure assimilation (g). For all observers, the addition of contrast influences did not increase the correlations obtained from assimilation only.
As a test of the versatility of our technique, we applied it to direction repulsion (Marshak & Sekuler, 1979) often referred to as the motion analogue of the tilt illusion. Figure 5 plots hotspots maps using these stimuli that clearly display interobserver variability. As with the tilt illusion, there were localized areas in the surround that were dominating observers' experience of direction repulsion, but these were not associated with any particular section of the stimulus. There also does not appear to be any obvious correlation between the areas driving perception for orientation and those underlying the motion repulsion effects in the same observers (compare data for IM and RM in Figures 2 & 5). 
Figure 5
 
Hotspots and significant pixels maps (p < 0.005) obtained for direction repulsion. The stimulus contained a vertical upwards drifting target of fractal noise (1/f amplitude spectrum) with windows of CW and CCW drifting fractal noise defined by the zero crossings of noise with a filter cutoff frequency of 16 c/image (1.04 c/°).
Discussion
Using our stimuli we find no evidence of any specific location within the surround that drives contextual effects in motion or orientation, but rather that all areas closest to the target influence perception. There was some variability between observers as to which sections most dominated perception; for example, in the tilt illusion experiment, observer IM appeared to be mainly influenced by structure along the top of the target whereas this section appeared to be irrelevant for RM. We suggest that the use of stimuli tiling the full surround comprehensively probes the complex interactions driving our perception of a target by giving equal stimulation to all areas. Our technique relies on the premise that if the strength of the surround's influence were spatially uniform, the two opposite stimuli activating it would null each other, and the observers' responses would be random; no structure would emerge from the hotspots technique. If the surround's strength is not spatially uniform, the observer's perception of the target will be correlated with the statistics of the stimulus over the hotspot(s). Our result could be reconciled with the findings that noncollinear structure may null or modulate the collinear effects due to low-level interactions occurring along a different axis of the stimulus (Dakin & Baruch, 2009; Solomon & Morgan, 2000). It is likely that the use of simplified configurations in earlier experiments may have boosted the influence of parts of the surround that would otherwise be subject to inhibitory interactions. 
Our method is akin to the classification image technique that is used to examine how observers make decisions about a visual task. The classification image method usually involves presenting a stimulus in white noise to determine how each pixel influenced performance (e.g., Abbey & Eckstein, 2002; Gold et al., 2000; Mareschal, Dakin, & Bex, 2006; Murray, 2011; Neri, Parker, & Blakemore, 1999; Solomon, 2002). A different method based on the rapid presentation of stimuli that act to temporally mask each other has also been successfully used to examine observers' sensitivity to orientation (Mareschal & Clifford, 2012; Ringach, 1997; Wong, Roeber, & Freeman, 2010) and motion (Iyer, Freeman, McDonald, & Clifford, 2011). In our procedure, no noise (either spatial or temporal) is added onto the stimulus; however, the underlying idea is the same: By using masks that differ on every trial in how they reveal sections of two opposite stimuli, we can uncover which areas of the surround are most influential to performance. 
The individual hotspots maps reveal an anisotropic layout of effects for orientation and motion. We propose that an observer's perceptual decision about a target is determined by the interplay of many influences. These include the neural architecture of interactions but also other factors, such as how the observer distributes their attention in space or which part of the target they fixate, that can act to modulate the strength of connections. The implication from these results is that the processing of very basic features, such as bars and gratings, is not solely a low-level process determined by detectors in early cortical areas but rather is subject to flexible higher level input. This is consistent with a recent finding of higher-level feedback altering low-level detector characteristics (Neri, 2011). 
Our technique is highly versatile; we have successfully applied it to map out contextual interactions in three realms of vision, demonstrating a lack of collinear superiority for motion and orientation. In the case of the lightness stimulus we examined the relative roles of side assimilation and end flank contrast in White's illusion, with our data highlighting the dominance of assimilative effects. It is important to note, however, that this structure arises largely due to the elongated nature of the stimulus. The predictive hotspots maps driven solely by assimilation (Figure 4B[d]) and by equal contributions of assimilation and contrast (Figure 4B[e]) are very similar. Our data support previous reports that the illusion is dependent on both the geometrical configuration of the target and the luminance of the inducers (Spehar, Clifford, & Agostini, 2002). 
Finally, these results have important implications for early vision models that feature excitatory and inhibitory connections between neurons based on their relative alignments, often used as a first step in extracting edges or contours (e.g., Field et al., 1993; Huang, Jiao, & Jia, 2008; Itti & Koch, 2001; Li, 2000). Our results reveal the need for modulatory factors, either acting directly on the low-level interactions or at a later stage, that would capture the observer variability by boosting the influence of sections of the stimulus while suppressing others. 
Acknowledgments
This research was funded by the Australian Centre of Excellence in Vision Science. CC was supported by a Future Fellowship from the Australian Research Council. 
Commercial relationships: none. 
Corresponding author: Isabelle Mareschal. 
Email: imareschal@gmail.com. 
Address: School of Biological and Chemical Sciences, Psychology, Queen Mary University of London, London, UK. 
References
Abbey C. K. Eckstein M. P. (2002). Classification image analysis: Estimation and statistical inference for two-alternative forced-choice experiments. Journal of Vision, 2 (1): 5, 66–78, http://www.journalofvision.org/content/2/1/5, doi:10.1167/2.1.5. [PubMed] [Article] [CrossRef]
Brainard D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436. [CrossRef] [PubMed]
Dakin S. C. Baruch N. J. (2009). Context influences contour integration. Journal of Vision, 9 (2): 13, 1–13, http://www.journalofvision.org/content/9/2/13, doi:10.1167/9.2.13. [PubMed] [Article] [CrossRef]
Enroth-Cugell C. Robson J. G. (1966). The contrast sensitivity of retinal ganglion cells of the cat. Journal of Physiology, 187, 517–552. [CrossRef] [PubMed]
Felsen G. Dan Y. (2005). A natural approach to studying vision. Nature Neuroscience, 8, 1643–1646. [CrossRef] [PubMed]
Field D. J. Hayes A. Hess R. F. (1993). Contour integration by the human visual system: Evidence for a local “association field.” Vision Research, 33, 173–193. [CrossRef] [PubMed]
Gilbert C. D. Wiesel T. N. (1990). The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat. Vision Research, 30, 1689–1701. [CrossRef] [PubMed]
Gold J. M. Murray R. F. Bennett P. J. Sekuler A. B. (2000). Deriving behavioural receptive fields for visually completed contours. Current Biology, 10, 663–666. [CrossRef] [PubMed]
Huang W. Jiao L. Jia J. (2008). Modeling contextual modulation in the primary visual cortex. Neural Networks, 21, 1182–1196. [CrossRef] [PubMed]
Hubel D. H. Wiesel T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215–243. [CrossRef] [PubMed]
Itti L. Koch C. (2001). Computational modeling of visual attention. Nature Reviews Neuroscience, 2, 194–203. [CrossRef] [PubMed]
Iyer P. B. Freeman A. W. McDonald J. S. Clifford C. W. (2011). Rapid serial visual presentation of motion: short-term facilitation and long-term suppression. Journal of Vision, 11 (3): 16, 1–16, http://www.journalofvision.org/content/11/3/16, doi:10.1167/11.3.16. [PubMed] [Article] [CrossRef]
Kapadia M. K. Westheimer G. Gilbert C. (2000). Spatial distribution of contextual interactions in primary visual cortex and in visual perception. Journal of Neurophysiology, 84, 2048–2062. [PubMed]
Kovacs I. Julesz B. (1993). A closed contour is much more than an incomplete one: effect of closure in figure-ground segmentation. Proceedings of the National Academy of Sciences, USA, 90, 7495–7497. [CrossRef]
Levitt J. B. Lund J. S. (1997). Contrast dependence of contextual effects in primate visual cortex. Nature, 387, 73–76. [CrossRef] [PubMed]
Li Z. (2000). Pre-attentive segmentation in the primary visual cortex. Spatial Vision, 13, 25–50. [CrossRef] [PubMed]
Mareschal I. Clifford C. W. G. (2012). Dynamics of unconscious contextual effects in orientation processing. Proceedings of the National Academy of Sciences, USA, 109, 7553–7558. [CrossRef]
Mareschal I. Dakin S. C. Bex P. J. (2006). Dynamic properties of orientation discrimination assessed by using classification images. Proceedings of the National Academy of Sciences, USA, 103, 5131–5136. [CrossRef]
Mareschal I. Morgan M. J. Solomon J. A. (2010). Attentional modulation of crowding. Vision Research, 50, 805–809. [CrossRef] [PubMed]
Marshak W. Sekuler R. (1979). Mutual repulsion between moving visual targets. Science, 205, 1399–1401. [CrossRef] [PubMed]
Morgan M. J. Baldassi S. (1997). How the human visual system encodes the orientation of a texture and why it makes mistakes. Current Biology, 7, 999–1002. [CrossRef] [PubMed]
Murray R. F. (2011). Classification images: A review. Journal of Vision, 11 (5): 2, 1–25, http://www.journalofvision.org/content/11/5/2, doi:10.1167/11.5.2. [PubMed] [Article] [CrossRef]
Nelson J. I. Frost J. (1978). Orientation-selective inhibition from beyond the classic receptive field. Brain Research, 139, 359–365. [CrossRef] [PubMed]
Neri P. (2011). Global properties of natural scenes shape local properties of human edge detectors. Frontiers in Psychology, 2, 1–20. [CrossRef] [PubMed]
Neri P. Parker A. J. Blakemore C. (1999). Probing the human stereoscopic system with reverse correlation. Nature, 401, 695–698.
Nishimoto S. Ishida T. Ohzawa I. (2006). Receptive field properties of neurons in early visual cortex revealed by local spectral reverse correlation. Journal of Neuroscience, 26, 3269–3280. [CrossRef] [PubMed]
Nothdurft H. C. Gallant J. L. van Essen D. C. (1999). Response modulation by texture surround in primate area V1: Correlates of “popout” under anaesthesia. Visual Neuroscience, 16, 15–34. [CrossRef] [PubMed]
Polat U. Sagi D. (1993). Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research, 33, 993–999. [CrossRef] [PubMed]
Rajashekar U. Bovik A. C. Cormack L. K. (2006). Visual search in noise: Revealing the influence of structural cues by gaze-contingent classification image analysis. Journal of Vision, 6 (4): 7, 379–386, http://www.journalofvision.org/content/6/4/7, doi:10.1167/6.4.7. [PubMed] [Article] [CrossRef]
Ringach D. (1997). Tuning of orientation detectors in human vision. Vision Research, 38 (7), 963–972.
Rust N. C. Movshon J. A. (2005). In praise of artifice. Nature Neuroscience, 8, 1647–1650. [CrossRef] [PubMed]
Schwartz O. Sejnowski T. J. Dayan P. (2006). A Bayesian framework for tilt perception and confidence. Advances in Neural Information Processing Systems, 1201–1208.
Sigman M. Cecchi G. A. Gilbert C. D. Magnasco M. O. (2001). On a common circle: Natural scenes and Gestalt rules. Proceedings of the National Academy of Sciences, USA, 98, 1935–1940. [CrossRef]
Sillito A. M. Grieve K. L. Jones H. E. Cudeiro J. Davis J. (1995). Visual cortical mechanisms detecting focal orientation discontinuities. Nature, 378, 492–496. [CrossRef] [PubMed]
Solomon J. A. (2002). Noise reveals visual mechanisms of detection and discrimination. Journal of Vision, 2 (1): 7, 105–120, http://www.journalofvision.org/content/2/1/7, doi:10.1167/2.1.7. [PubMed] [Article] [CrossRef]
Solomon J. A. Morgan M. J. (2000). Facilitation from collinear flanks is cancelled by non-collinear flanks. Vision Research, 40, 279–286. [CrossRef] [PubMed]
Spehar B. Clifford C. W. G. Agostini T. (2002). Induction in variants of White's effect: Common or separate mechanisms? Perception, 31, 189–196. [CrossRef] [PubMed]
Toet A. Levi D. M. (1992). The two-dimensional shape of spatial interaction zones in the parafovea. Vision Research, 32, 1349–57. [CrossRef] [PubMed]
Walker G. A. Ohzawa I. Freeman R. D. (1999). Asymmetric suppression outside the classical receptive field of the visual cortex. Journal of Neuroscience, 19, 10536–10553. [PubMed]
Westheimer G. (1990). Simultaneous orientation contrast for lines in the human fovea. Vision Research, 30, 1913–1921. [CrossRef] [PubMed]
White M. (1979). A new effect of pattern on perceived lightness. Perception, 8, 413–416. [CrossRef] [PubMed]
Wong E. M. Y. Roeber U. Freeman A. W. (2010). Lengthy suppression from similar stimuli during rapid serial visual presentation. Journal of Vision, 10 (1): 14, 1–12, http://www.journalofvision.org/content/10/1/14, doi:10.1167/10.1.14. [PubMed] [Article] [CrossRef]
Figure 1
 
Sample stimuli and the two types of analysis performed. (a) Examples of vertical and horizontal tilt illusion stimuli. (b) A new bipolar mask defined by the zero crossings in low-pass filtered noise is generated on each trial. (c) The black and white sections of the mask contain oppositely oriented gratings and the target (horizontal or vertical) grating is located in the center of the stimulus. (d) Masks that led to the observer responding “CCW” are summed and subtracted from the sum of masks that led the observer to respond “CW,” resulting in a hotspots map where pixel intensity is proportional to strength of influence. (e) Example of the conventional White's illusion and (f) the masked version of White's illusion. (g) Masks on the left side that led to the observer reporting the left as appearing lighter were summed together and polarity-inverted masks on the right side that were reported to appear lighter were summed together. The compiled “left lighter” and “right lighter” images were added into a composite hotspots map where pixel intensity is proportional to influence.
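As an illustration of steps (b) and (d), the following minimal Python sketch builds a bipolar mask from the sign of low-pass filtered noise and forms a hotspots map as the sum of "CW" masks minus the sum of "CCW" masks. The function names `bipolar_mask` and `hotspots_map`, the Gaussian white noise source, and the hard-edged frequency cutoff are assumptions made for this example; they are not taken from the authors' implementation.

```python
import numpy as np


def bipolar_mask(size, cutoff_cycles, rng):
    """Return a +1/-1 mask whose borders are the zero crossings of
    low-pass filtered noise (cutoff given in cycles per image)."""
    noise = rng.standard_normal((size, size))
    # Spatial frequency of each FFT coefficient, in cycles per image.
    freqs = np.fft.fftfreq(size) * size
    radius = np.hypot(freqs[:, None], freqs[None, :])
    # Assumption: an ideal (hard-edged) low-pass filter.
    lowpass = radius <= cutoff_cycles
    filtered = np.real(np.fft.ifft2(np.fft.fft2(noise) * lowpass))
    # The sign of the filtered noise defines the two mask polarities.
    return np.where(filtered >= 0, 1.0, -1.0)


def hotspots_map(masks, responses):
    """Sum of masks on "CW" trials minus sum of masks on "CCW" trials."""
    masks = np.asarray(masks, dtype=float)
    responses = np.asarray(responses)
    cw_sum = masks[responses == "CW"].sum(axis=0)
    ccw_sum = masks[responses == "CCW"].sum(axis=0)
    return cw_sum - ccw_sum


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    masks = [bipolar_mask(64, 6, rng) for _ in range(200)]
    # Hypothetical observer: the response follows the mask polarity at one pixel.
    responses = ["CW" if m[32, 40] > 0 else "CCW" for m in masks]
    hmap = hotspots_map(masks, responses)
    print(hmap.shape, hmap[32, 40])
```

For a simulated observer whose response depends only on the mask polarity at a single pixel, that pixel reaches the largest possible value in the map (the number of trials), which is the sense in which pixel intensity reflects strength of influence.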
Figure 2
 
Orientation hotspots maps obtained for two different mask sizes. Sample stimuli are shown on the left for a horizontal target (top) and a vertical target (bottom). Observers' hotspots maps are in the middle panels; the top two rows were obtained using a mask generated with a filter cutoff of 6 c/image (0.67 c/°), and the bottom three rows were obtained with a mask generated with a filter cutoff of 8 c/image (0.90 c/°). The panels show (a) a vertical condition, (b) a horizontal condition, (c) the difference map between the vertical and horizontal conditions, and (d) a control condition in which the surround gratings were ± 60° from the target orientation and produced no tilt illusion (horizontal target for IM and RM, vertical target for EG and GV). Significant pixels are shown in the right-hand panels of the figure for the corresponding conditions. The bottom row shows the summed hotspots maps (observers IM, RM, EG) for the vertical condition (left panels) and the horizontal condition (right panels). Summed pixels maps are the sum of pixel values within one wedge of a circular-edged bow tie (bandwidth 30°) using two different circle radii to create the bow tie (the color of the data points indicates the radius used, as shown on the hotspots maps).
Figure 3
 
Orientation hotspots maps for observer IM at four different mask sizes (ordinate) and three different target sizes (abscissa) in the vertical and horizontal conditions. Significant pixels are shown next to the maps.
Figure 4
 
Lightness (White's illusion) hotspots maps and predictions. (A) White's illusion hotspots maps obtained by summing the masks (left) and significant pixels (p = 0.0001; right) in four observers, obtained using a vertical bar target with a filter cutoff of 12 c/image (2.73 c/°) to create windows onto the background. (B) Predictive hotspots maps were built by convolving the estimated point spread function of the filter used to create the mask with one of two configurations: assimilation only along the sides (a) and contrast only along the ends (c), resulting in predictive hotspots maps for assimilation only (d) and contrast only (f). The relative contributions of the assimilation and contrast maps were varied to construct a set of combinatorial predictive maps (equal contribution shown in [e]). (C) Combinatorial predictive maps were correlated with each observer's hotspots map to determine which ratio of assimilation to contrast produced the greatest correlation (g); a ratio of zero is pure contrast and a ratio of one is pure assimilation. For all observers, the addition of contrast influences did not increase the correlations obtained from assimilation only.
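The comparison in (C) can be sketched as a one-dimensional search over the mixing weight. The Python example below is a minimal sketch under stated assumptions: the function name `best_assimilation_weight`, the linear mixing rule w × assimilation + (1 − w) × contrast, the number of weight steps, and the use of the Pearson correlation are all hypothetical choices, and the two predictive maps are assumed to have already been convolved with the point spread function.

```python
import numpy as np


def best_assimilation_weight(observed, assimilation, contrast, n_steps=11):
    """Correlate an observed hotspots map with weighted mixtures of two
    predictive maps; weight 0 is pure contrast, weight 1 is pure assimilation."""
    weights = np.linspace(0.0, 1.0, n_steps)
    correlations = []
    for w in weights:
        # Assumption: the combined prediction is a linear mixture of the maps.
        predicted = w * assimilation + (1.0 - w) * contrast
        r = np.corrcoef(observed.ravel(), predicted.ravel())[0, 1]
        correlations.append(r)
    correlations = np.array(correlations)
    best = int(np.argmax(correlations))
    return weights[best], correlations


if __name__ == "__main__":
    # Toy predictive maps for a vertical bar: side flanks and end flanks.
    assimilation = np.zeros((20, 20))
    assimilation[5:15, 6:8] = 1.0
    assimilation[5:15, 12:14] = 1.0
    contrast = np.zeros((20, 20))
    contrast[3:5, 8:12] = -1.0
    contrast[15:17, 8:12] = -1.0
    weight, curve = best_assimilation_weight(assimilation, assimilation, contrast)
    print(weight)
```

If the observed map is driven by assimilation alone, the correlation peaks at a weight of one, which corresponds to the pattern described for all observers in (g).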
Figure 5
 
Hotspots and significant pixels maps (p < 0.005) obtained for direction repulsion. The stimulus contained a target of fractal noise (1/f amplitude spectrum) drifting vertically upward, with windows of CW and CCW drifting fractal noise defined by the zero crossings of noise with a filter cutoff frequency of 16 c/image (1.04 c/°).