Article  |   September 2011
Higher order texture statistics impair contrast boundary segmentation
Journal of Vision September 2011, Vol.11, 14. doi:https://doi.org/10.1167/11.10.14

      Elizabeth Arsenault, Ahmad Yoonessi, Curtis Baker; Higher order texture statistics impair contrast boundary segmentation. Journal of Vision 2011;11(10):14. https://doi.org/10.1167/11.10.14.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Texture boundary segmentation is conventionally thought to be mediated by global differences in Fourier energy, i.e., low-order texture statistics. Here, we have examined the importance of higher order statistical structure of textures in a simple second-order segmentation task. We measured modulation depth thresholds for contrast boundaries imposed on texture samples extracted from natural scene photographs, using forced-choice judgments of boundary orientation (left vs. right oblique). We compared segmentation thresholds for contrast boundaries whose constituent textures were either intact or phase scrambled. In the intact condition, all the texture statistics were preserved, while in the phase-scrambled condition the higher order statistics of the same texture were randomized, but the lower order statistics were unchanged. We found that (1) contrast boundary segmentation is impaired by the presence of higher order statistics; (2) every texture shows impairment but some substantially more than others; and (3) our findings are not related to scrambling-induced changes in detectability. The magnitude of phase-scrambling effect for individual textures was uncorrelated with variations in their amplitude spectra, but instead we suggest that it might be related to differences in local edge structure or sparseness.

Introduction
Our rich perceptual experience of the shapes, objects, and surfaces that make up the visual world relies on successful segmentation of distinct regions in an image to delineate the boundaries between them. The visual system can detect boundaries defined by changes in a number of properties, commonly divided into two categories: those that can be distinguished based on a point-to-point comparison of simple intensive properties such as luminance or color (first order) and those that require two-stage processing to distinguish, such as orientation, spatial frequency, or contrast of textures (second order). Processing of these first- and second-order boundaries is widely thought to be mediated by distinct mechanisms (e.g., Allard & Faubert, 2007; Schofield & Georgeson, 1999). First-order processing is relatively well modeled in terms of linear Gabor-like spatial filters that ostensibly represent V1 receptive fields. Second-order boundaries are inherently more complex, and how they are segmented has been a continuing subject of investigation. 
We use the term “segmentation” not to refer to a specific task but to refer to the process by which the visual system detects second-order boundaries. In this paper, we examine second-order vision through its simplest manifestation: contrast boundary segmentation. It is well known that contrast boundary segmentation performance depends on some of the properties of the texture over which the contrast gradient is defined, i.e., the carrier. In particular, carrier orientation orthogonal to the contrast boundary facilitates contrast boundary detection at low spatial frequencies (Dakin & Mareschal, 2000), and higher spatial frequency carriers have been found to show an advantage as well (Dakin & Mareschal, 2000; Sutter, Sperling, & Chubb, 1995). However, with relatively broadband noise, spatial frequency content was found to have little impact on detection (Schofield & Georgeson, 2003). These studies were restricted to simple filtered noise carriers, and the full extent to which a texture's appearance is relevant to the operation of second-order mechanisms remains to be seen. In this paper, we address this issue by imposing contrast modulations on textures sampled from natural images to begin our examination of the importance of a wide range of statistical structure on second-order vision. 
It has long been clear that only a limited subset of a texture's properties are used by the visual system to segment it from another texture. A classic demonstration of this is our inability to segment pairs of textures whose elements are readily discriminated—for example, upright and inverted chevrons (Olson & Attneave, 1970) or Ts and Ls (Beck, 1966; Bergen & Julesz, 1983). Such observations led naturally to the idea that textures should be thought of in fundamentally statistical terms and that their segmentation is based on a representation in which only some image statistics are preserved. In the last two decades, most of the work on mechanisms of texture segmentation has been couched in terms of two-stage filtering models (Bergen & Adelson, 1988; Landy & Graham, 2004) that can be thought of as comparing the global Fourier energy across a boundary. These models have only been evaluated using simple synthetic textures, and it is unclear how adequately such models can account for human segmentation of textures, and boundaries defined over textures, that contain a wider variety of local features. Texture segmentation is an important example of the emerging general idea that many of our perceptual abilities seem to be based not on a perfect translation of the retinal image but on “summary statistics”, a compressed statistical representation in which only some attributes of the retinal image are retained (Chong & Treisman, 2003; Rosenholtz, 2011). There is evidence that such a representation is automatically and pre-attentively computed (Oliva & Torralba, 2001, 2007), and it appears as though we make some judgments based only on a subset of the statistics available in the image (Alvarez & Oliva, 2008; Ariely, 2001; Chong & Treisman, 2003). In at least some contexts, such as peripheral vision (Balas, Nakano, & Rosenholtz, 2009), a statistical summary of the information in the stimulus may be more relevant to perception than the stimulus itself. 
Finding the most appropriate summary statistics for a given task is both informative about the mechanisms involved and important to consider when evaluating the results of past studies or designing stimuli for future experiments. 
In this work, we employ a particularly useful and popular way of describing image statistics using a Fourier decomposition of the image. We can distinguish between lower and higher order statistics based on the Fourier power spectrum and phase spectrum. The lower order statistics represented in the power spectrum describe the global energy present in the image: luminance, contrast, spatial frequency, and orientation. The phase spectrum embodies higher order statistics that describe the spatial distribution of that energy (Oppenheim & Lim, 1981; Piotrowski & Campbell, 1982; Thomson & Foster, 1997). For example, step edges in luminance occur when Fourier components of the same orientation over a range of spatial frequencies are phase-aligned in their zero-crossings; such broadband edges are considered to be of particular interest in statistics of natural images (Olshausen & Field, 1996; Thomson, 1999). 
While randomizing an image's phase structure will severely handicap identification of image content (Hansen & Hess, 2007), some of the textural aspects of the image's appearance are preserved, particularly for textures with a high degree of regularity (Emrith, Chantler, Green, Maloney, & Clarke, 2010). While some aspects of overall texture and shading may be captured by the power spectrum (Tadmor & Tolhurst, 1993), several studies have shown that phase spectral information contributes to human perception of isolated textures. Kingdom, Hayes, and Field (2001) manipulated parameters of synthetic micropattern textures to modify their contrast, skew, and kurtosis; they found that human observers could most efficiently discriminate textures differing only in their fourth-order statistics (kurtosis). Demonstrations of texture synthesis (Portilla & Simoncelli, 2000) showed that a variety of higher order statistics are required to capture a texture's appearance when it is attentively examined, though evidently only a subset of these higher order statistics is necessary for pre-attentive discrimination of textures (Balas, 2006). Motoyoshi and Kingdom (2010) demonstrated that discrimination of random paired-Gabor textures was enhanced by a co-circular relationship between nearby orientational structures. 
Even though information in the phase spectrum is critical to higher level tasks such as texture appearance judgments and can aid the discrimination of one texture from another, its relevance to a pre-attentive, low-level task such as texture segmentation remains unclear. The popular conception of an energy model of segmentation emphasizes global comparisons of lower order statistics, but other models have been based on different sets of statistics, some of which are higher order. Julesz (1962) conjectured that texture segmentation mechanisms might operate on only a subset of available statistics, i.e., the relationship between the luminance values of any two pixels at a given distance from one another. This theory was later expanded to include relationships between triplets (Julesz, Gilbert, & Victor, 1978) and quadruplets (Julesz, 1981) of pixels. Graham, Sutter, and Venkatesan (1993) demonstrated element arrangement patterns created with oriented Gabor patches that can be readily segmented along boundaries defined only by differences in the relative positions of the texture elements, implying a mechanism that is sensitive to phase information. To achieve human-like segmentation in natural scenes by a computer vision algorithm, Martin, Fowlkes, and Malik (2004) and later Arbeláez, Maire, Fowlkes, and Malik (2011) made use of higher order texture statistics along with other boundary cues. They classified each pixel of an image as belonging to one of a small collection of “textons” based on the responses of a range of co-localized oriented filters, followed by a second stage operator that compares the texton histograms on opposing sides of a putative boundary. Phase scrambling would remove the spatial co-localization of filter responses that define these textons, and so texton-based segmentation would be impossible. 
Thus, there is evidence suggesting that higher order statistics influence segmentation, but a systematic study is difficult because what constitutes a “higher order statistic” is unbounded and defined only by exclusion to consist of anything that is not a lower order statistic. In this work, we use natural image photographs to sample higher order texture statistics that are likely to be critical to ecological vision and explore the relationship between these statistical regularities and human performance on a texture segmentation task. 
The texture statistics most ecologically relevant to segmentation are those occurring on either side of boundaries that occur in natural images. However, photographs of natural texture boundaries would make poor experimental stimuli for a number of reasons: (1) the texture boundaries in natural images most often arise from occlusions of one object by another, which typically are accompanied by coincident luminance changes and, therefore, are not purely second order (Johnson & Baker, 2004); (2) experimentally manipulating the textures on either side of a boundary is problematic without affecting the boundary itself; and (3) boundaries in images from natural scenes (excluding man-made structures) are rarely straight, further complicating the preceding difficulty. Instead, we approach the problem with photographs of natural textures, which we can individually manipulate and use as carrier patterns to construct synthetic envelope boundaries. This semi-natural approach gives us the same access to the higher order texture statistics that are present in photographs of the real world, while affording the benefits of using synthetic boundaries: experimentally controllable texture statistics and a consistent boundary shape without luminance artifacts. 
We explore segmentation of boundaries defined by contrast differences imposed across individual textures rather than segmentation of a boundary between two distinct textures, for two reasons. First, contrast gradients are the simplest form of texture boundary and, thus, more amenable to analysis. Second, this approach allows us to deal with individual textures one at a time, affording a better opportunity to investigate the effects of individual differences in texture statistics. 
Note that most previous studies of higher order texture statistics and segmentation have explored whether it was possible to segment boundaries defined by differences in these statistics, such that they were necessary to do the task (e.g., Julesz et al., 1978). In the present experiments, by contrast, the higher order texture statistics are, in principle, irrelevant to the task; instead, we ask whether their presence facilitates or impairs segmentation performance. 
To evaluate the role of higher order carrier statistics in contrast boundary segmentation, we look at psychophysical performance under two conditions: natural textures with all the statistics preserved (“intact” condition) or phase-scrambled versions of the same natural textures in which the higher order statistics have been randomized but the lower order statistics remain the same (“scrambled” condition). If the power spectrum provides the basis of segmentation, we would expect to find no differences in psychophysical performance between the intact and scrambled conditions. If any higher order information is utilized by the visual system in this task, we would expect to see performance impaired in the scrambled condition. On the other hand, the boundary might be obscured by higher order information in the texture, in which case we would expect improved performance in the scrambled condition. 
General methods
Stimuli
The natural textures used in this experiment were acquired from high-resolution photographs (3888 × 2592 pixels) taken in a variety of locations such as parks, beaches, and botanical gardens. A digital SLR camera (Canon Digital Rebel XTi) was used to take the photos in RAW format with a linear gamma profile, which were then converted to 16-bit TIFF and imported into Matlab. From each of these photographs, we manually extracted candidate texture regions of 480 × 480 pixel squares. 
The candidate images were then screened subjectively by the authors to evaluate the extent to which they exemplified key characteristics of “texture”: uniformity of lightness, contrast, and granularity (Bergen, 1991; Kingdom et al., 2001; Portilla & Simoncelli, 2000; Wilkinson, 1990; Wilkinson & Wilson, 1998). We used these characteristics to define our acceptance criteria for textures as images that appeared to be relatively uniform and composed predominantly of a single type of material (such as grass, bark, or ripples in sand) or a homogeneous mixture of materials (e.g., branches and leaves). We also required the detail of the texture to be in focus and free of prominent segmentable objects. Textures of man-made materials such as bricks, concrete, or tiles were excluded. Examples of textures excluded in this stage are shown in Figure 1A (top). 
Figure 1
 
Examples of (A) excluded and (B) included natural textures. (A, top) Images that were excluded due to a subjective judgment that they were not sufficiently uniform, homogeneous, in focus, or contained prominent segmentable objects. (A, bottom) Images that were excluded during computer screening due to inhomogeneity of luminance or contrast between two or more quadrants. (B) Images that were included in the texture corpus.
The textures that passed the subjective screening were converted to grayscale (using the Matlab function rgb2gray) and further screened objectively for internal homogeneity by comparing the luminance and RMS contrast (Bex & Makous, 2002; Kingdom et al., 2001) of the four quadrants of each texture. If any pairwise difference exceeded 3 dB, the texture was excluded (Figure 1A, bottom). Approximately 64% of the hand-selected textures passed this test, providing a database of 239 natural texture images. Four examples of these textures are displayed in Figure 1B.
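The quadrant screening described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' Matlab code: it assumes that RMS contrast is the standard deviation of luminance divided by its mean, that decibels follow the 20·log10 amplitude convention, and that both luminance and contrast are compared for every pair of quadrants. The function names are hypothetical.

```python
import itertools
import numpy as np

def quadrant_stats(image):
    """Return (mean luminance, RMS contrast) for each of the four quadrants."""
    h, w = image.shape
    quads = [image[:h // 2, :w // 2], image[:h // 2, w // 2:],
             image[h // 2:, :w // 2], image[h // 2:, w // 2:]]
    # Assumption: RMS contrast = standard deviation / mean luminance.
    return [(q.mean(), q.std() / q.mean()) for q in quads]

def db_difference(a, b):
    """Absolute difference between two positive values in dB (assumed 20*log10)."""
    return abs(20.0 * np.log10(a / b))

def is_homogeneous(image, criterion_db=3.0):
    """True if no pair of quadrants differs by more than the criterion."""
    stats = quadrant_stats(image)
    for (lum_a, con_a), (lum_b, con_b) in itertools.combinations(stats, 2):
        if db_difference(lum_a, lum_b) > criterion_db:
            return False
        if db_difference(con_a, con_b) > criterion_db:
            return False
    return True
```

Under these assumptions, a quadrant whose contrast is twice that of its neighbours differs by about 6 dB and would be rejected.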
The stimuli for all of our experiments used the textures from this database as carrier patterns. Texture stimuli for the baseline (“intact”) condition were created using the texture as described above, to measure segmentation with all the higher and lower order statistics present in grayscale natural photographs. In the second (“scrambled”) condition, we phase scrambled the intact texture to remove the higher order statistics. We created scrambled textures by applying a Fourier transform to both the intact texture and a white noise image of the same size. The phase values in the natural texture were replaced with those of the white noise and inverse-transformed, thus leaving the power spectrum unchanged while completely randomizing the phases (Dakin, Hess, Ledgeway, & Achtman, 2002). 
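The phase-scrambling procedure can be illustrated with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: Gaussian white noise supplies the replacement phases, and the phase of the DC term is kept from the original so that the mean is unchanged (an illustrative choice). Because the noise image is real, its phase spectrum is antisymmetric, so the inverse transform is real up to rounding error.

```python
import numpy as np

def phase_scramble(texture, rng=None):
    """Replace the phase spectrum of `texture` with that of a white noise image."""
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.fft2(texture)
    noise_spectrum = np.fft.fft2(rng.standard_normal(texture.shape))
    new_phase = np.angle(noise_spectrum)
    # Illustrative choice: keep the original DC phase so the mean is preserved.
    new_phase[0, 0] = np.angle(spectrum[0, 0])
    # Original amplitudes combined with the noise phases.
    scrambled = np.fft.ifft2(np.abs(spectrum) * np.exp(1j * new_phase))
    return scrambled.real  # imaginary part is numerical round-off only
```

The amplitude spectrum of the output matches that of the input, while the spatial arrangement of that energy is randomized.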
To create the carriers, each texture was scaled to have a mean value of 0, and its extreme values were clipped at ±3 standard deviations and scaled to fit in the range of intensities between ±1.0. This texture carrier was then modulated by an envelope pattern to create a synthetic contrast boundary. For our envelope, we used a half-disk pattern with an oblique orientation boundary, graduated over 20% of the image width with a cosine taper (Figure 2). The final stimulus, $S_{x,y}$, is the product of the stimulus window, $W_{x,y}$, the carrier, $C_{x,y}$, and the envelope, $E_{x,y}$, scaled by the modulation depth, $m$: 
$$S_{x,y} = L_0 \left\{ 1 + c\, C_{x,y}\, W_{x,y} \left( \frac{1 + m E_{x,y}}{2} \right) \right\}, \tag{1}$$
where $|C_{x,y}| \le 1.0$, $|E_{x,y}| \le 1.0$, and $0 \le W_{x,y} \le 1$. $L_0$ is the mean luminance, $m$ is the modulation depth, and $c$ is a contrast scaling factor that is adjusted to produce the desired RMS contrast. 
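Equation 1 can be assembled as follows. This sketch makes several assumptions that the text does not specify: the window is a hard-edged disk, the envelope is a raised-cosine edge running from -1 to +1 across a 45-degree line through the image center, and the `direction` argument (±1) merely selects one of the two obliques without claiming which is "left" on a given display. All function names are hypothetical.

```python
import numpy as np

def normalize_carrier(texture):
    """Zero-mean, clip at +/-3 SD, then rescale so that |C| <= 1."""
    carrier = texture - texture.mean()
    limit = 3.0 * carrier.std()
    carrier = np.clip(carrier, -limit, limit)
    return carrier / np.abs(carrier).max()

def centered_grid(size):
    """x and y pixel coordinates measured from the image center."""
    coords = np.arange(size) - (size - 1) / 2.0
    return np.meshgrid(coords, coords)

def oblique_envelope(size, direction=1, taper_fraction=0.2):
    """Oblique edge from -1 to +1 with a cosine taper (assumed form)."""
    x, y = centered_grid(size)
    distance = (x + direction * y) / np.sqrt(2.0)  # signed distance from diagonal
    half_taper = taper_fraction * size / 2.0       # taper spans 20% of the width
    ramp = np.clip(distance / half_taper, -1.0, 1.0)
    return np.sin(0.5 * np.pi * ramp)

def disk_window(size):
    """Hard-edged circular window (assumption), 1 inside and 0 outside."""
    x, y = centered_grid(size)
    return (np.hypot(x, y) <= size / 2.0).astype(float)

def make_stimulus(carrier, envelope, window, m, c, mean_luminance):
    """Equation 1: S = L0 * (1 + c * C * W * (1 + m * E) / 2)."""
    return mean_luminance * (1.0 + c * carrier * window * (1.0 + m * envelope) / 2.0)
```

With m = 1 the carrier contrast falls to zero on the low side of the boundary, and with m = 0 the contrast is uniform across the disk, which is why m serves as the threshold variable.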
Figure 2
 
Examples of the stimuli used to determine modulation depth thresholds in Experiments 1–3, shown at three modulation depths (top to bottom: 75, 50, and 32%). The envelope is a left- or right-oblique half-disk contrast modulation applied to a (left) intact or (right) phase-scrambled natural texture.
We used these stimuli to measure threshold values of modulation depth (m) or carrier contrast, for intact (Figure 2, left) and phase-scrambled (Figure 2, right) natural textures using an envelope orientation judgment (±45 deg) in a two-alternative forced-choice task. We presented the stimuli at a suprathreshold contrast in all experiments unless otherwise specified. 
To prevent observers from performing the task by monitoring the contrast of only one quadrant of the texture, the phase of the envelope was randomly shifted 180 degrees from trial to trial. To further diversify the stimulus appearance and impair observers' ability to learn and use specific texture features, carrier textures were randomly flipped vertically and/or horizontally on each trial, prior to applying the contrast envelope. 
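The per-trial randomization can be sketched as below. It assumes that each flip and each sign is chosen independently with probability 0.5, which the text does not state explicitly, and it treats the 180-degree envelope phase shift as a sign flip of the envelope (swapping the high- and low-contrast halves). The function name is hypothetical.

```python
import numpy as np

def randomize_trial(carrier, rng):
    """Randomly flip the carrier and choose envelope polarity and direction."""
    if rng.random() < 0.5:
        carrier = np.flipud(carrier)  # vertical flip
    if rng.random() < 0.5:
        carrier = np.fliplr(carrier)  # horizontal flip
    # A 180-degree phase shift of the envelope is a sign change of E.
    polarity = 1 if rng.random() < 0.5 else -1
    # +1 or -1 selects one of the two oblique boundary orientations.
    direction = 1 if rng.random() < 0.5 else -1
    return carrier, polarity, direction
```

Because the polarity is unpredictable, the contrast of any single quadrant carries no information about the boundary orientation.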
The stimuli were presented on a CRT monitor (Sony Trinitron Multiscan G400, 81 cd/m², 75 Hz, 1024 × 768 pixels), gamma-linearized with a digital video processor (Bits++, Cambridge Research Systems) that allowed us to present low-contrast stimuli without binarizing artifacts by increasing the bit depth from 8 to 14 bits. Stimulus patterns appeared in a central 480 × 480 pixel patch on a mean gray background. Observers viewed the stimuli from a distance of 114 cm, resulting in a stimulus visual angle of approximately 6.5 degrees. The experiments were run on a Macintosh (Desktop Pro, MacOSX) using Matlab and PsychToolbox (Brainard, 1997; Pelli, 1997). 
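The viewing geometry follows the standard visual-angle relation, angle = 2·atan(size / (2·distance)). The sketch below applies it to the stated values; the implied on-screen width and the pixels-per-degree figure are derived here for illustration and are not reported in the text.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size at a given distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def size_for_angle_cm(angle_deg, distance_cm):
    """Physical size that subtends a given visual angle at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Derived from the stated 6.5 degrees at 114 cm (not a reported measurement).
implied_width_cm = size_for_angle_cm(6.5, 114.0)  # roughly 12.9 cm
pixels_per_degree = 480 / 6.5                     # roughly 74 pixels per degree
```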
Task
At the beginning of each trial, observers were presented with a central fixation point and used a button press to initiate each 100-ms stimulus presentation. The envelope boundary was oriented 45 degrees either left or right oblique, and observers indicated with a button press the perceived orientation of the boundary. Feedback was not provided as a precaution against aiding spurious cue learning. The screen was maintained at the mean gray background between stimulus presentations. 
We measured thresholds using a method of constant stimuli with five logarithmically spaced level values, chosen to span an appropriate range as determined from pilot experiments for each observer. A minimum of three blocks of 100 trials, with 20 trials per level, was run for each condition to yield a total of at least 60 trials per level. Percent-correct data from a total of 600 trials were fit with a logistic function, and a threshold was interpolated for 75% correct. Curve fitting was performed by the statistics package Prism (GraphPad Software), and standard error measurements were estimated with its bootstrapping algorithm. 
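For readers who want to reproduce this kind of analysis, the following is a sketch of a threshold estimate. It is not the Prism procedure: it assumes a two-alternative logistic that rises from 50% to 100% correct as a function of log stimulus level, and it uses a coarse grid search over a maximum-likelihood criterion instead of Prism's fitting and bootstrapping. All names are hypothetical.

```python
import math

def p_correct(log_level, alpha, beta):
    """Assumed 2AFC logistic: 0.5 at chance, 1.0 at ceiling, 0.75 at alpha."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(log_level - alpha) / beta))

def neg_log_likelihood(alpha, beta, log_levels, n_correct, n_trials):
    """Binomial negative log-likelihood of the data under the logistic."""
    total = 0.0
    for x, k, n in zip(log_levels, n_correct, n_trials):
        p = min(max(p_correct(x, alpha, beta), 1e-9), 1.0 - 1e-9)
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_threshold(levels, n_correct, n_trials, steps=200):
    """Grid-search fit; returns (75%-correct threshold, slope parameter)."""
    xs = [math.log10(v) for v in levels]
    lo, hi = min(xs), max(xs)
    best = None
    for i in range(steps + 1):
        alpha = lo + (hi - lo) * i / steps
        for j in range(1, steps + 1):
            beta = (hi - lo) * j / steps
            nll = neg_log_likelihood(alpha, beta, xs, n_correct, n_trials)
            if best is None or nll < best[0]:
                best = (nll, alpha, beta)
    # With this parameterization the 75% point is at log_level == alpha.
    return 10.0 ** best[1], best[2]
```

A call such as `fit_threshold([10, 17.8, 31.6, 56.2, 100], [31, 35, 45, 55, 59], [60] * 5)` returns a threshold in the same units as the levels; the counts in this example are made up for illustration.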
We tested the significance of the difference between thresholds in the intact and scrambled conditions using a two-tailed paired-samples t-test with a criterion α = 0.05 and measured the effect size (Kline, 2005) using Cohen's d with the standardizer s computed as 
$$s = \sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2}}, \tag{2}$$
where $\sigma_1$ and $\sigma_2$ are the sample standard deviations of the compared conditions. 
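The effect size and test statistic can be computed with the Python standard library, as sketched below. The standardizer follows Equation 2; the example data are made up, and the p-value is omitted because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics

def cohens_d(x, y):
    """Cohen's d with the standardizer s = sqrt((sd_x^2 + sd_y^2) / 2)."""
    s = math.sqrt((statistics.variance(x) + statistics.variance(y)) / 2.0)
    return (statistics.mean(x) - statistics.mean(y)) / s

def paired_t(x, y):
    """Paired-samples t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1

# Made-up thresholds for four observers, for illustration only.
intact = [2.0, 4.0, 6.0, 8.0]
scrambled = [1.0, 2.0, 3.0, 4.0]
d = cohens_d(intact, scrambled)
t, df = paired_t(intact, scrambled)
```

Note that d standardizes by the between-observer spread of each condition, whereas the paired t statistic depends on the spread of the within-observer differences, so the two can diverge considerably.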
Experiment 1
This experiment examined whether an observer's ability to segment contrast boundaries is affected by higher order statistics of carrier textures drawn from a large sample of texture appearances. This was accomplished by comparing modulation depth thresholds for contrast boundaries with natural texture (“intact”) carriers and those with phase-scrambled (“scrambled”) carriers. 
Methods
To obtain a general picture of the contribution of higher order statistics to segmentation, we measured modulation depth thresholds for the texture library as a whole. On each trial, a carrier texture was selected from the database randomly without replacement within each block of 100 trials. At a suprathreshold carrier RMS contrast of 14.5%, we measured modulation depth thresholds for each observer in the intact and scrambled conditions. We collected data from four experienced psychophysical observers, three of whom (JB, AM, JH) were naive to the hypotheses of the experiment. 
Results
The results for Experiment 1 are shown in Figure 3. Modulation depth thresholds for phase-scrambled textures (light bars) are substantially lower than those for intact textures (shaded bars) for each observer. We found a large, statistically significant effect of phase scrambling (t(3) = 14.71, p < 0.05, d = 2.86) with the average observer's intact threshold 2.36 dB above their scrambled threshold. These results not only suggest that the presence of higher order statistics in natural textures is a relevant factor in performance on this task but also that segmentation improves when higher order statistics are removed. 
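To put the decibel difference in linear terms, the sketch below converts between decibels and threshold ratios. It assumes the 20·log10 amplitude convention, which the text does not state explicitly.

```python
import math

def db_to_ratio(db):
    """Linear ratio corresponding to a difference in dB (assumed 20*log10)."""
    return 10.0 ** (db / 20.0)

def ratio_to_db(ratio):
    """Difference in dB corresponding to a linear ratio (assumed 20*log10)."""
    return 20.0 * math.log10(ratio)

# Under this convention, 2.36 dB means intact thresholds are about
# 1.31 times the scrambled thresholds.
intact_over_scrambled = db_to_ratio(2.36)
```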
Figure 3
 
Modulation depth threshold results from Experiment 1 for four observers for intact and scrambled texture conditions. Thresholds were lower for the phase-scrambled textures (light bars) than for the intact textures (shaded bars). Error bars represent ±1 standard error.
Experiment 2
In the previous experiment, we observed a difference in thresholds for intact and phase-scrambled textures as ensembles, providing evidence for a role for higher order texture statistics in boundary segmentation. However, since our textures vary widely in appearance, it is unclear to what extent our result is uniformly representative across textures or if some textures demonstrate a greater effect of phase scrambling than others. In this experiment, by comparing modulation depth thresholds for individual intact and phase-scrambled textures, we aimed to determine what effect individual differences in texture appearance have on modulation depth thresholds of contrast boundaries. 
Methods
This experiment was conducted in the same manner as Experiment 1 in almost every respect. However, rather than randomly selecting textures on each trial, modulation depth thresholds were measured in separate blocks for each of twenty individual textures chosen to span a wide range of appearances and represent a variety of scales, materials, and environments. For each threshold measurement, a single texture was used on every stimulus presentation, so that modulation depth thresholds, and therefore any difference between the intact and phase-scrambled conditions, could be assessed separately for each texture. 
A modulation depth threshold was determined for each texture in the intact and scrambled conditions. Data were collected for three observers, two of whom (JH and AM) were naive to the hypotheses of the experiment. 
Results
The results from this experiment are shown in Figure 4, where each symbol indicates the thresholds for the scrambled versus the intact conditions for a particular texture. The dashed line indicates the 1:1 ratio between the two thresholds, which is where we would expect the data to fall if there were no effect of phase scrambling. The thresholds for all textures tested fall below the 1:1 line, indicating that the intact thresholds are higher than the scrambled thresholds, in agreement with the results from Experiment 1. On average, intact thresholds are 2.25 dB (SD = 0.84 dB) higher than scrambled thresholds for observer LA, 2.48 dB (SD = 1.08 dB) higher for JH, and 2.44 dB (SD = 1.29 dB) higher for AM. Overall, thresholds for all subjects show a substantial, statistically significant reduction after the carrier is phase scrambled: LA, t(19) = 8.46, p < 0.05, d = 2.00; AM, t(19) = 4.89, p < 0.05, d = 2.00; and JH, t(19) = 7.3, p < 0.05, d = 1.99. 
Figure 4
 
Modulation depth threshold results from Experiment 2 for three observers. Each symbol plots the phase-scrambled versus the intact threshold for a particular texture. In almost all 20 textures tested, for all three observers, the symbols lie below the 1:1 line (dashed), indicating that the intact threshold is higher than the phase-scrambled threshold. The amount of reduction, or the distance from the 1:1 line, is texture dependent. Error bars show the standard error on each measurement.
From the scatter plots in Figure 4, it is apparent that while thresholds for all textures are affected by phase scrambling to some extent, some thresholds are reduced substantially more than others. One contributing factor appears to be the magnitude of the intact threshold; textures with higher intact thresholds seem to show more reduction than those with lower intact thresholds. A one-tailed Spearman correlation shows a significant, positive correlation between the intact threshold and the threshold reduction in decibels: LA, r(20) = 0.72, p < 0.05; AM, r(20) = 0.60, p < 0.05; and JH, r(20) = 0.84, p < 0.05. Thus, the textures that are more difficult on the segmentation task are the ones that benefit most from phase scrambling. 
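A Spearman correlation is a Pearson correlation computed on ranks. The sketch below shows the calculation with the standard library only; it assigns tied values their average rank and does not compute the p-value. The function names are hypothetical and the example inputs are made up.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For the analysis described above, `x` would hold the intact thresholds for the twenty textures and `y` the corresponding threshold reductions in decibels.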
To get an idea of what specific texture attributes might contribute to the differing thresholds, we sorted the textures into a histogram (Figure 5) based on the threshold change for each texture averaged across the three observers. The textures that have a small effect of phase scrambling tend to be made up of densely packed, smaller features or markings, while the textures that showed a large effect of phase scrambling tend to be composed of larger elements with longer continuous contours. 
Figure 5
 
Histogram of texture carriers used in Experiment 2, based on average magnitude of the effect of phase scrambling. The textures that show a larger change (>2 dB) tend to have more prominent edges and appear to be more sparse.
Experiment 3
In the previous experiments, the textures were all equated for RMS contrast, and this metric (like other low-order image statistics) is preserved after phase scrambling. Nevertheless, it is conceivable that our results could be explained by systematic differences in detectability between the intact and phase-scrambled texture conditions. If the scrambled textures were easier to detect than their intact counterparts, they might be at an advantage in the contrast boundary segmentation task. Here, the same task was undertaken as before on a representative subset of the textures from Experiment 2 but using stimuli constructed from textures at fixed contrast increments above their individually measured detection thresholds. 
Methods
In this experiment, two thresholds were determined in separate blocks for each condition: first, the carrier contrast threshold, and then the modulation depth threshold. As before, “modulation depth” (m in Equation 1) refers to the extent to which the envelope, in this case a contrast change, is applied. “Carrier contrast” refers to the RMS contrast level of the unmodulated carrier. 
We measured carrier contrast thresholds using a method of constant stimuli for each condition, texture, and observer. Five logarithmically spaced carrier contrast levels were tested at a modulation depth of 100% (Figure 6) using the same left- or right-oblique segmentation task. Then, to compare modulation depth thresholds as directly as possible, we presented the stimuli at 6 dB above each observer's carrier contrast threshold for that particular texture. We tested eight natural textures from the previous subset of twenty for this experiment. Carrier contrast and then modulation depth thresholds were measured for two observers, one of whom (JB) was naive to the purposes of the experiment. 
Figure 6
 
Stimuli used to determine carrier contrast thresholds in Experiment 3 shown at a range of carrier contrasts (top to bottom: 8, 5, and 3% RMS contrast), all with 100% modulation depth. Thresholds were determined for (A) intact and (B) scrambled textures.
Results
The carrier contrast threshold results are shown in Figure 7, where each symbol indicates the scrambled and intact thresholds for a particular texture. The points fall very close to the equality line, suggesting that there is no systematic effect of phase scrambling on detectability. We found no statistically significant differences between the carrier contrast thresholds of intact and scrambled textures for either observer LA (t(7) = 0.412, p > 0.05) or JB (t(7) = 2.038, p > 0.05). We also found relatively little variability between textures; the axes illustrated in Figure 7 span a range of only one octave, compared with a four-octave range illustrated in Figure 4. This finding of very similar detection thresholds for different RMS contrast-equated textures, whether intact or scrambled, is consistent with the report of Bex and Makous (2002) that RMS contrast provides a good contrast metric for natural images. 
Figure 7
 
Carrier contrast threshold results from Experiment 3 for two observers. Each symbol plots the phase-scrambled vs. the intact threshold for a particular texture. The dashed line indicates where a texture's intact and phase-scrambled thresholds correspond exactly. Carrier contrast thresholds are centered on the 1:1 line, suggesting no systematic change in detectability when a texture is phase scrambled. Variation in thresholds between observers, textures, and conditions is minimal.
Modulation depth thresholds for the detectability-equated intact and phase-scrambled textures are shown in Figure 8. All points lie below the 1:1 line as in Experiment 2, indicating that thresholds were again lowered following phase scrambling. Comparing the average effect of phase scrambling, we find that intact thresholds are still 2.17 dB (SD = 0.83) higher than scrambled thresholds for observer LA and 2.53 dB (SD = 1.28) higher for observer JB. The difference between the thresholds in the intact and phase-scrambled conditions remains statistically significant for both observers: LA (t(7) = 5.734, p < 0.05, d = 2.59) and JB (t(7) = 4.326, p < 0.05, d = 2.11). Furthermore, the large effect sizes (d) reported here are similar to those found in Experiment 2, as are the average changes in threshold, indicating that the effect observed in Experiments 1 and 2 is not the result of differences in effective RMS contrast for intact and phase-scrambled textures. 
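A comparison of this kind can be sketched in Python as below. The text does not state which formulas were used for the dB difference or for the effect size d, so the sketch assumes a 20 log10 threshold ratio, a paired t-test on the per-texture differences, and d defined as the mean difference divided by the standard deviation of the differences. The threshold values are hypothetical.

```python
import math
import statistics


def threshold_change_db(intact, scrambled):
    """Per-texture threshold change in dB (assumed 20*log10 of the threshold ratio)."""
    return [20.0 * math.log10(i / s) for i, s in zip(intact, scrambled)]


def paired_t_and_d(differences):
    """Paired t statistic, degrees of freedom, and one common paired-samples d."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD (n - 1 in the denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    cohen_d = mean_diff / sd_diff  # assumed definition; others exist
    return t_stat, n - 1, cohen_d


# Hypothetical modulation depth thresholds for eight textures (not the study's data).
intact = [0.40, 0.36, 0.42, 0.38, 0.45, 0.33, 0.39, 0.41]
scrambled = [0.31, 0.30, 0.30, 0.29, 0.36, 0.27, 0.28, 0.33]

changes = threshold_change_db(intact, scrambled)
t_stat, df, cohen_d = paired_t_and_d(changes)
print(statistics.mean(changes), t_stat, df, cohen_d)
```

With eight textures the test has 7 degrees of freedom, matching the t(7) values reported in the text. The p value would then be read from a t distribution with that many degrees of freedom.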
Figure 8
 
Modulation depth threshold results from Experiment 3 for two observers. Each symbol plots the phase-scrambled versus the intact threshold for a particular texture, with each texture a fixed increment above its detection threshold. The dashed line indicates where a texture's intact and phase-scrambled thresholds correspond exactly. Modulation depth thresholds measured with these detectability-equated contrasts are still systematically lower for scrambled than for intact textures.
Discussion
In this study, we found that the presence of higher order statistics impaired performance on a basic texture segmentation task. In Experiment 1, we used an ensemble of more than 200 natural texture photographs to show that it is more difficult to segment contrast boundaries imposed on intact textures than those imposed on phase-scrambled textures. We extended this result in Experiment 2, showing that this effect occurs in varying degrees for different individual textures. Finally, in Experiment 3, we showed that intact and scrambled textures are about equally detectable and that scaling the carrier contrast to the detection thresholds of individual textures and observers does not eliminate or even reduce the observed effect. Based on these results, we cannot rule out the possibility that some kinds of higher order statistics could contribute positively to segmentation; we simply conclude that whatever help some statistics might contribute, they do not overcome the impairment imposed by other statistics. 
Our finding that higher order information impairs performance runs contrary to what has been found in many non-segmentation tasks such as texture discrimination (Phillips & Todd, 2010), spectral slope discrimination (Thomson & Foster, 1997), and scene recognition (Hansen & Hess, 2007), where higher order statistics improve performance. However, higher order statistics have previously been found to impair the detection of distortions in natural scenes (Bex, 2010). As in the work described here, other studies have found that perception depends on more than simply the presence or absence of higher order statistics; it depends on some statistics more than others, and the degree of their importance varies from image to image for reasons that are not entirely clear (Bex, Solomon, & Dakin, 2009; Hansen & Hess, 2007; Phillips & Todd, 2010). 
In the past, different investigators have considered various kinds of “higher order” statistics—excellent reviews can be found in Kingdom et al. (2001) and Landy and Graham (2004). Julesz et al. (1978) emphasized the importance of considering higher order statistics in segmentation models, but their use of the term is not congruent with the more conventional Fourier-based statistics that we employed in this study. By controlling nth-order correlations, one can create images with identical autocorrelation functions and, therefore, identical Fourier amplitude spectra in an ensemble average (Julesz, 1962; Julesz et al., 1978; Victor, Chubb, & Conte, 2005). The Julesz constraint that ensures that second-order correlations are identical does not preclude individual samples of these populations from differing in their second-order statistics (Chubb & Yellott, 2000; Yellott, 1993). Though Victor (1994) argued that texture statistics, by nature, characterize a population rather than individual samples, segmentation mechanisms have access to only a pair of samples at any given moment and so sample statistics cannot be ignored. Furthermore, these statistics are difficult to examine in the context of the linear filtering models that are prevalent in modern vision theory, because the Julesz statistics are not maintained following linear filtering (Klein & Tyler, 1986). 
Which higher order statistics impair segmentation?
What image statistics might be at the root of our results remains unclear. In the histogram of our stimuli (Figure 5), there is a strong visual impression of a difference in structural appearance between the textures whose thresholds are least affected by phase scrambling (left side) and those most affected by phase scrambling (right side). Some specific apparent differences are relative amounts of high and low spatial frequency information, structural sparseness, and local edge structure, which we will now consider. 
Spectral slope. While we find an effect of higher order statistics on segmentation, the magnitude of the threshold change could be associated with individual differences in the amplitude spectra of the textures. It appears as though the textures whose thresholds are most affected by phase scrambling might have relatively less energy in the high spatial frequencies and proportionately more energy at lower frequencies. This difference in the proportions of low and high spatial frequency information could have an impact on segmentation mechanisms. To assess the relative amounts of high and low spatial frequencies in our stimuli, we measured their spectral slopes by fitting a linear regression to the log–log plot of Fourier amplitude vs. spatial frequency (Bex & Makous, 2002; Thomson & Foster, 1997) for each texture; steeper negative slopes would indicate relatively less energy in the high spatial frequencies. The results are plotted in Figure 9A as a function of the change in threshold between intact and scrambled conditions. Note that most of the spectral slopes were close to −1, as expected for natural images (Field, 1987; Ruderman, 1997). There does not appear to be any relationship between threshold change and spectral slope, and a Pearson correlation on these variables failed to find any significant correlation (r(20) = −0.39, p > 0.05). This lack of relationship suggests that an explanation of our results based on relative differences in high vs. low spatial frequencies is unlikely. 
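The slope fit described above amounts to an ordinary least-squares line through log amplitude plotted against log spatial frequency. A minimal sketch follows. It assumes the amplitude spectrum has already been averaged across orientation into (frequency, amplitude) pairs, and it uses synthetic spectra rather than the study's textures.

```python
import math


def spectral_slope(frequencies, amplitudes):
    """Least-squares slope of log10(amplitude) against log10(spatial frequency)."""
    xs = [math.log10(f) for f in frequencies]
    ys = [math.log10(a) for a in amplitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic spectra: amplitude proportional to f^-1 and to f^-1.5.
freqs = [1, 2, 4, 8, 16, 32, 64]
typical = [f ** -1.0 for f in freqs]
steeper = [f ** -1.5 for f in freqs]

print(spectral_slope(freqs, typical))  # close to -1
print(spectral_slope(freqs, steeper))  # close to -1.5
```

In the second spectrum the amplitude falls off faster with frequency, so the more negative slope corresponds to proportionately less energy at high spatial frequencies.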
Figure 9
 
Relationship between image statistic indices and the change in segmentation threshold between intact and scrambled conditions in decibels. (A) Slope of falloff of Fourier spectrum. (B) Sparseness, as measured using intensity histogram kurtosis. (C) Sparseness, as measured with the LSSM metric of Hansen and Hess (2007)—a wavelet-based metric developed for natural scenes. (D) Edge density, modified from Bex (2010). Note the lack of relationship between threshold change and kurtosis, LSSM, or slope but a clear correlation between edge density and threshold change.
Sparseness. Colloquially, structural sparseness can be defined in terms of the amount of “stuff” that appears to be in a texture (Adelson, 2001). A collection of 50 leaves seen close up can be considered a “sparse” texture, while a field of millions of blades of grass seen from a distance appears less sparse. Textures that have been phase scrambled do not appear sparse because there are no local concentrations of energy (e.g., small edges or other texture markings) and complementary regions of blank space forming discrete objects. Upon phase scrambling, the pixel and wavelet distributions become normal (Bex & Makous, 2002) rather than the kurtotic distribution that is a signature of sparseness (Kingdom et al., 2001). Sparseness is well known as a key attribute of natural scenes (Field, 1998; Ruderman, 1997), but it is also a primary property of textures. Victor and Conte (1996) proposed “granularity” as an important higher order distinction between textures, which they investigated using textures formed with a range of element sizes. Computer science and image statistical methods have described “coarseness” as a major dimension along which textures vary (Rubner & Tomasi, 1998). Durgin (1995, 2008) showed that density is a primitive texture feature for which adaptation effects can be measured, and Kingdom et al. (2001) demonstrated that textures can be discriminated based on sparseness. However, none of these previous efforts examined the impact of sparseness on boundary segmentation. 
To measure sparseness, Kingdom et al. (2001) suggested intensity histogram or wavelet-based kurtosis, and Hansen and Hess (2007) developed a wavelet-based measure of kurtosis, the LSSM, to assess the sparseness of natural images. We computed these metrics for each texture and plotted them against the textures' change in threshold (Figures 9B and 9C)—in both cases, there is no systematic relationship, as confirmed by the lack of significant correlation between threshold change and either pixel kurtosis (r(20) = −0.15, p > 0.05) or LSSM (r(20) = −0.07, p > 0.05). These results suggest either that sparseness is not a relevant higher order statistic or that the sparseness metrics we employed are not sufficiently sensitive to sparseness. Note that textures appear relatively dense compared to images of scenes, and it may be that these sparseness metrics perform less well in this context. 
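As an illustration of the simpler of the two metrics, intensity histogram kurtosis can be computed as the fourth central moment divided by the squared variance. The sketch below uses this Pearson definition, under which a normal distribution gives 3. It is not the LSSM, and the two pixel lists are hypothetical examples rather than textures from the study.

```python
def kurtosis(values):
    """Pearson kurtosis: fourth central moment over squared variance (normal = 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


# Two hypothetical pixel samples with the same mean (0) and variance (1).
dense = [1.0, -1.0] * 50                        # energy spread evenly over all pixels
sparse = [0.0] * 96 + [5.0, -5.0, 5.0, -5.0]    # energy concentrated in a few pixels

print(kurtosis(dense))   # 1.0
print(kurtosis(sparse))  # 25.0
```

Both samples have the same variance, so a second-order measure cannot tell them apart, while kurtosis separates them clearly.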
Edge structure. It might not be the global arrangement of the energy (sparseness) that determines the difference in the thresholds but the varying density of broadband features within the image (local edge structures). The textures that were more affected by phase scrambling (Figure 5, right) appear to have more prominent local edges. We assessed the amount of local edge structure using a modified version of the method for computing edge density outlined in Bex (2010). We integrated a Canny edge map (constructed using MATLAB's Canny edge detector) and normalized by the number of pixels in the texture to obtain an index of edge density. In a plot of this edge density index against the textures' change in threshold (Figure 9D), we can see a systematic relationship: The textures with greatest effect of phase scrambling had lower edge density indices, while those with least effect had higher edge densities. This relationship was confirmed by a significant negative correlation between this rough measure of edge density and threshold change (r(20) = −0.80, p < 0.05), suggesting that some aspect of both broadband edges and the density in which they occur may impact the effect of higher order image statistics on segmentation performance. However, this result is also consistent with a role for sparseness, since sparser textures would produce smaller indices of edge density. Untangling these factors may be problematic with natural texture photographs, but using synthetic textures, where both sparseness and local edge structure can be controlled, is a clear way forward. 
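The edge density index described above is the number of edge pixels in a binary edge map divided by the total number of pixels. The sketch below illustrates that normalization only. To stay self-contained it swaps the Canny detector for a much cruder stand-in, a threshold on central-difference gradient magnitude, so its values would not match those of a real Canny edge map. The function names, threshold, and test image are hypothetical.

```python
import math


def gradient_edge_map(image, threshold):
    """Crude stand-in for an edge detector (NOT Canny): threshold the gradient magnitude."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            if math.hypot(gx, gy) > threshold:
                edges[y][x] = 1
    return edges


def edge_density(edge_map):
    """Sum of the binary edge map divided by the number of pixels."""
    edge_pixels = sum(sum(row) for row in edge_map)
    return edge_pixels / (len(edge_map) * len(edge_map[0]))


# Hypothetical 8 x 8 image containing a single vertical luminance step.
step_image = [[0.0 if x < 4 else 1.0 for x in range(8)] for _ in range(8)]
flat_image = [[0.5] * 8 for _ in range(8)]

print(edge_density(gradient_edge_map(step_image, 0.25)))  # 0.1875
print(edge_density(gradient_edge_map(flat_image, 0.25)))  # 0.0
```

In the step image the interior pixels in the two columns adjacent to the step are marked as edges (12 of 64 pixels), and a uniform image yields an index of zero.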
Higher order statistics impair segmentation performance
Why the presence of higher order statistics might impair the segmentation of contrast boundaries is an open question. 
Camouflage or masking. It could be the case that luminance-defined contours in intact textures camouflage the target boundary—however, luminance noise has little impact on contrast boundary segmentation (Allard & Faubert, 2007), so we do not expect this to influence our results. Aside from luminance variation, sparse images contain low- and high-contrast regions, and this spatial modulation of contrast could mask the modulation that observers are tasked with identifying (Allard & Faubert, 2007). We consider this unlikely for three reasons: (1) the observer knows that the edge will be in one of two positions, so there is very little positional uncertainty; (2) the textures are randomly flipped from trial to trial, so any texture features that happen to appear along the envelope boundary will only affect some (25%) of the trials; and (3) second-order masking is spatial frequency dependent (Hutchinson & Ledgeway, 2004), which suggests that such high spatial frequency contrast noise should not affect performance on our low spatial frequency boundary. Low spatial frequency contrast noise should not present a problem because we specifically excluded textures that were too coarse or had large-scale contrast gradients. However, the precise bandwidths of the noise, boundary, and second-order mechanisms are ill-defined, so while it seems unlikely we cannot rule out the possibility that second-order masking plays a role in determining our results. 
Subjective texture selection. One might argue that including those textures we excluded based on their subjective characteristics could somehow reduce or nullify the observed effect of phase scrambling. Because we found a strong correlation between threshold change and Canny edge density, we performed the analysis of Figure 9D on the 49 textures excluded in the subjective stage of texture screening. We found that they have systematically lower edge density (M = 0.060, SD = 0.02) than the 20 we tested in Experiment 2 (M = 0.095, SD = 0.03), and thus, their inclusion would, in fact, have been more likely to increase the size of the effect we observe. 
Edge vs. region processing. Finally, it may be that phase-scrambled and dense images provide more support along the contrast boundary itself and so are easiest to segment. This explanation supposes that the mechanism responsible for segmentation preferentially uses information near the texture boundary (an edge-based process) rather than integrating information throughout the entire stimulus (a region-based process). This is possible but is a departure from the common conception of texture segmentation mechanisms as two-stage filter models with large second-stage filters that can operate across the entire region. In the context of this model, a strong reliance on edge support would be surprising, particularly for contrast modulations. 
The regional summation in the second stage of a standard energy model is not explicitly selective for the local features encoded in higher order statistics, but there are various ways to adapt this model to enable such selectivity. The simplest modification of a standard FRF model would entail changing the non-linear function separating the first- and second-stage filters. Graham and Sutter (1998) found that an expansive power-law non-linearity with an exponent between 3 and 4 better accounted for their findings with element arrangement patterns. By using a non-linearity that is more expansive than a square law, textures with localized areas of high energy (such as sparse natural images) would give a greater response. A more serious modification to the FRF scheme would be to use first-stage filters that act like non-linear “feature detectors”—for example, Martin et al. (2004) used histograms of different types of local features, defined by the co-localizations of wavelet responses, to segment texture boundaries. Finally, one might add an additional non-linear process beyond the second stage. Graham et al. (1993) argued that an additional stage of a filter–rectify cascade (i.e., Filter–Rectify–Filter–Rectify–Filter rather than FRF) was necessary to segment some element arrangement patterns that differed only in their higher order statistics. 
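The effect of the intermediate non-linearity can be illustrated with a toy calculation. The sketch below is not a full FRF model. It treats the input values as hypothetical first-stage filter outputs and collapses the second-stage filter into a uniform spatial average, so that only the choice of exponent varies.

```python
def pooled_response(first_stage_outputs, exponent):
    """Rectify with a power law, then average over space (uniform second stage)."""
    n = len(first_stage_outputs)
    return sum(abs(v) ** exponent for v in first_stage_outputs) / n


# Two hypothetical sets of first-stage outputs with identical mean energy (RMS = 1).
dense = [1.0, -1.0] * 50                        # energy spread evenly
sparse = [0.0] * 96 + [5.0, -5.0, 5.0, -5.0]    # energy concentrated locally

for exponent in (2.0, 3.5):
    print(exponent, pooled_response(dense, exponent), pooled_response(sparse, exponent))
```

With a square law the two inputs give the same pooled response because their energy is matched. With an exponent of 3.5, a value chosen from within the range of 3 to 4 mentioned above, the input with localized high energy gives the larger response.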
Conclusion
We have shown that texture segmentation mechanisms are sensitive to the information in the phase spectrum of an image. From these findings, we cannot be certain which specific statistics contribute, but it appears as though sparseness and local edge structure in particular might be relevant statistics in this task. To address these issues, we intend to use synthetic stimuli to isolate these specific higher order statistics and to gauge their impact on segmentation independently. It is not yet clear whether current models of texture segmentation account for how humans process higher order statistics, but testing and modifying those models may prove to be informative. A detailed examination of the effects of higher order statistics as an ensemble and as individual exemplars (e.g., sparseness) on segmentation will be useful for refining models to the point where we can begin to apply them to biologically relevant stimuli. We can conclude that while textures are segmented using a limited subset of the information they contain, this subset must be expanded to include higher order statistics in some capacity. 
Acknowledgments
The authors would like to thank Fred Kingdom and Aaron Johnson for helpful discussions and suggestions, Bruce Hansen for sharing his LSSM code, their reviewers for their constructive comments, and their subjects. This work was supported by an NSERC Grant OPG0001978 to CB. 
Commercial relationships: none. 
Corresponding author: Elizabeth Arsenault. 
Address: Royal Victoria Hospital, Room H4.14, 687 Pine Avenue West, Montreal, Quebec H3A 1A1, Canada. 
References
Adelson E. H. (2001). On seeing stuff: The perception of materials by humans and machines. Proceedings of the SPIE, 4299, 1–12.
Allard R. Faubert J. (2007). Double dissociation between first- and second-order processing. Vision Research, 47, 1129–1141.
Alvarez G. A. Oliva A. (2008). The representation of simple ensemble visual features outside the focus of attention. Psychological Science, 19, 392–398.
Arbeláez P. Maire M. Fowlkes C. Malik J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 896–916.
Ariely D. (2001). Seeing sets: Representation by statistical properties. Psychological Science, 12, 157–162.
Balas B. Nakano L. Rosenholtz R. (2009). A summary-statistic representation in peripheral vision explains visual crowding. Journal of Vision, 9(12):13, 1–18, http://www.journalofvision.org/content/9/12/13, doi:10.1167/9.12.13.
Balas B. J. (2006). Texture synthesis and perception: Using computational models to study texture representations in the human visual system. Vision Research, 46, 299–309.
Beck J. (1966). Effect of orientation and of shape similarity on perceptual grouping. Attention, Perception & Psychophysics, 1, 300–302.
Bergen J. R. (1991). Theories of visual texture perception. In Regan D. (Ed.), Vision and visual dysfunction (vol. 10B, pp. 114–134). New York: Macmillan Press.
Bergen J. R. Adelson E. H. (1988). Early vision and texture perception. Nature, 333, 363–364.
Bergen J. R. Julesz B. (1983). Parallel versus serial processing in rapid pattern discrimination. Nature, 303, 696–698.
Bex P. J. (2010). (In) Sensitivity to spatial distortion in natural scenes. Journal of Vision, 10(2):23, 1–15, http://www.journalofvision.org/content/10/2/23, doi:10.1167/10.2.23.
Bex P. J. Makous W. (2002). Spatial frequency, phase, and the contrast of natural images. Journal of the Optical Society of America A, 19, 1096–1106.
Bex P. J. Solomon S. G. Dakin S. C. (2009). Contrast sensitivity in natural scenes depends on edge as well as spatial frequency structure. Journal of Vision, 9(10):1, 1–19, http://www.journalofvision.org/content/9/10/1, doi:10.1167/9.10.1.
Brainard D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Chong S. C. Treisman A. (2003). Representation of statistical properties. Vision Research, 43, 393–404.
Chubb C. Yellott J. I., Jr. (2000). Every discrete, finite image is uniquely determined by its dipole histogram. Vision Research, 40, 485–492.
Dakin S. C. Hess R. F. Ledgeway T. Achtman R. L. (2002). What causes non-monotonic tuning of fMRI response to noisy images? Current Biology, 12, R476–R477.
Dakin S. C. Mareschal I. (2000). Sensitivity to contrast modulation depends on carrier spatial frequency and orientation. Vision Research, 40, 311–329.
Durgin F. H. (1995). Texture density adaptation and the perceived numerosity and density of texture. Journal of Experimental Psychology: Human Perception and Performance, 21, 149–169.
Durgin F. H. (2008). Texture density adaptation and visual number revisited. Current Biology, 18, R855–R856.
Emrith K. Chantler M. J. Green P. R. Maloney L. T. Clarke A. D. F. (2010). Measuring perceived differences in surface texture due to changes in higher-order statistics. Journal of the Optical Society of America A, 27, 1232–1244.
Field D. J. (1987). Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4, 2379–2394.
Field D. J. (1998). Visual coding, redundancy, and “feature detection”. In Arbib M. A. (Ed.), The handbook of brain theory and neural networks (pp. 1012–1016). Cambridge, MA: MIT Press.
Graham N. Sutter A. (1998). Spatial summation in simple (Fourier) and complex (non-Fourier) texture channels. Vision Research, 38, 231–257.
Graham N. Sutter A. Venkatesan C. (1993). Spatial-frequency- and orientation-selectivity of simple and complex channels in region segregation. Vision Research, 33, 1893–1911.
Hansen B. C. Hess R. F. (2007). Structural sparseness and spatial phase alignment in natural scenes. Journal of the Optical Society of America A, 24, 1873–1885.
Hutchinson C. V. Ledgeway T. (2004). Spatial frequency selective masking of first-order and second-order motion in the absence of off-frequency ‘looking’. Vision Research, 44, 1499–1510.
Johnson A. P. Baker C. L., Jr. (2004). First- and second-order information in natural images: A filter-based approach to image statistics. Journal of the Optical Society of America A, 21, 913–925.
Julesz B. (1962). Visual pattern discrimination. IRE Transactions on Information Theory, IT-8, 84–92.
Julesz B. (1981). Textons, the elements of texture perception, and their interactions. Nature, 290, 91–97.
Julesz B. Gilbert E. N. Victor J. D. (1978). Visual discrimination of textures with identical third-order statistics. Biological Cybernetics, 31, 137–140.
Kingdom F. A. A. Hayes A. Field D. J. (2001). Sensitivity to contrast histogram differences in synthetic wavelet textures. Vision Research, 41, 585–598.
Klein S. A. Tyler C. W. (1986). Phase discrimination of compound gratings: Generalized autocorrelation analysis. Journal of the Optical Society of America A, 3, 868–878.
Kline R. B. (2005). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
Landy M. S. Graham N. (2004). Visual perception of texture. In Chalupa L. M. Werner J. S. (Eds.), The visual neurosciences (pp. 1106–1118). Cambridge, MA: MIT Press.
Martin D. R. Fowlkes C. C. Malik J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 530–549.
Motoyoshi I. Kingdom F. A. A. (2010). The role of co-circularity of local elements in texture perception. Journal of Vision, 10(1):3, 1–8, http://www.journalofvision.org/content/10/1/3, doi:10.1167/10.1.3.
Oliva A. Torralba A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175.
Oliva A. Torralba A. (2007). The role of context in object recognition. Trends in Cognitive Sciences, 11, 520–527.
Olshausen B. A. Field D. J. (1996). Natural image statistics and efficient coding. Network: Computation in Neural Systems, 7, 333–339.
Olson R. K. Attneave F. (1970). What variables produce similarity grouping? American Journal of Psychology, 81, 1–21.
Oppenheim A. V. Lim J. S. (1981). The importance of phase in signals. Proceedings of the IEEE, 69, 529–541.
Pelli D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Phillips F. Todd J. T. (2010). Texture discrimination based on global feature alignments. Journal of Vision, 10(6):6, 1–14, http://www.journalofvision.org/content/10/6/6, doi:10.1167/10.6.6.
Piotrowski L. N. Campbell F. W. (1982). A demonstration of the visual importance and flexibility of spatial-frequency amplitude and phase. Perception, 11, 337–346.
Portilla J. Simoncelli E. P. (2000). A parametric texture model based on joint statistics of complex wavelet coefficients. International Journal of Computer Vision, 40, 49–71.
Rosenholtz R. (2011). What your visual system sees where you are not looking. In Rogowitz B. E. Pappas T. N. Proceedings of SPIE: Human vision and Electronic Imaging XVI. San Francisco.
Rubner Y. Tomasi C. (1998). Texture metrics. Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 5, 4601–4607.
Ruderman D. L. (1997). Origins of scaling in natural images. Vision Research, 37, 3385–3398.
Schofield A. J. Georgeson M. A. (1999). Sensitivity to modulations of luminance and contrast in visual white noise: Separate mechanisms with similar behaviour. Vision Research, 39, 2697–2716.
Schofield A. J. Georgeson M. A. (2003). Sensitivity to contrast modulation: The spatial frequency dependence of second-order vision. Vision Research, 43, 243–259.
Sutter A. Sperling G. Chubb C. (1995). Measuring the spatial frequency selectivity of second-order texture mechanisms. Vision Research, 35, 915–924.
Tadmor Y. Tolhurst D. J. (1993). Both the phase and amplitude spectrum may determine the appearance of natural images. Vision Research, 33, 141–145.
Thomson M. G. A. (1999). Visual coding and the phase structure of natural scenes. Network: Computation in Neural Systems, 10, 123–132.
Thomson M. G. A. Foster G. H. (1997). Role of second- and third-order statistics in the discriminability of natural images. Journal of the Optical Society of America A, 14, 2081–2090.
Victor J. D. (1994). Images, statistics, and textures: Implications of triple correlation uniqueness for texture statistics and the Julesz conjecture: Comment. Journal of the Optical Society of America A, 11, 1680–1684.
Victor J. D. Chubb C. Conte M. M. (2005). Interaction of luminance and higher-order statistics in texture discrimination. Vision Research, 45, 311–328.
Victor J. D. Conte M. M. (1996). The role of high-order phase correlations in texture processing. Vision Research, 36, 1615–1631.
Wilkinson F. (1990). Texture segmentation. In Stebbins W. C. Berkley M. A. (Eds.), Comparative perception (pp. 125–156). New York: John Wiley.
Wilkinson F. Wilson H. R. (1998). Measurement of the texture-coherence limit for bandpass arrays. Perception, 711–728.
Yellott J. I., Jr. (1993). Implications of triple correlation uniqueness for texture statistics and the Julesz conjecture. Journal of the Optical Society of America A, 10, 777–793.
Figure 1
 
Examples of (A) excluded and (B) included natural textures. (A, top) Images that were excluded due to a subjective judgment that they were not sufficiently uniform, homogeneous, in focus, or contained prominent segmentable objects. (A, bottom) Images that were excluded during computer screening due to inhomogeneity of luminance or contrast between two or more quadrants. (B) Images that were included in the texture corpus.
Figure 2
 
Examples of the stimuli used to determine modulation depth thresholds in Experiments 1–3 shown at three modulation depths (top to bottom: 75, 50, and 32). The envelope is a left- or right-oblique half-disk contrast modulation applied to an (left) intact or (right) phase-scrambled natural texture.
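The two manipulations named in this caption, phase scrambling and a half-disk contrast envelope, can be illustrated with a minimal NumPy sketch. This is not the authors' stimulus code. The function names `phase_scramble` and `contrast_boundary`, the multiplicative envelope `1 + depth * side`, the unit-disk window, and the use of a square carrier are all assumptions made for illustration.

```python
import numpy as np

def phase_scramble(img, rng):
    """Randomize Fourier phases while keeping the amplitude spectrum (sketch)."""
    mean = img.mean()
    amplitude = np.abs(np.fft.fft2(img - mean))
    # The phase spectrum of real-valued white noise is random but keeps the
    # conjugate symmetry needed for the inverse transform to be real.
    random_phase = np.angle(np.fft.fft2(rng.standard_normal(img.shape)))
    scrambled = np.fft.ifft2(amplitude * np.exp(1j * random_phase)).real
    return scrambled + mean

def contrast_boundary(carrier, depth, orientation_deg=45.0):
    """Apply a hypothetical half-disk contrast envelope to a square carrier.

    depth is the modulation depth as a fraction (0 = no boundary);
    orientation_deg sets the direction along which contrast steps.
    """
    n = carrier.shape[0]
    y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
    theta = np.deg2rad(orientation_deg)
    # +1 on one side of the oblique boundary, -1 on the other.
    side = np.sign(x * np.cos(theta) + y * np.sin(theta))
    disk = (x ** 2 + y ** 2) <= 1.0  # circular window
    return carrier * (1.0 + depth * side) * disk

rng = np.random.default_rng(0)
texture = rng.standard_normal((64, 64))          # stand-in for a natural texture
scrambled = phase_scramble(texture, rng)
stimulus = contrast_boundary(scrambled, depth=0.5, orientation_deg=45.0)
```

In this sketch the scrambled image has the same amplitude spectrum as the input, so only the phase-dependent (higher order) structure differs between the two carriers; changing `orientation_deg` to 135 gives the opposite oblique.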
Figure 3
 
Modulation depth threshold results from Experiment 1 for four observers for intact and scrambled texture conditions. Thresholds were lower for the phase-scrambled textures (light bars) than for the intact textures (shaded bars). Error bars represent ±1 standard error.
Figure 4
 
Modulation depth threshold results from Experiment 2 for three observers. Each symbol plots the phase-scrambled versus the intact threshold for a particular texture. In almost all 20 textures tested, for all three observers, the symbols lie below the 1:1 line (dashed), indicating that the intact threshold is higher than the phase-scrambled threshold. The amount of reduction, or the distance from the 1:1 line, is texture dependent. Error bars show the standard error on each measurement.
Figure 5
 
Histogram of texture carriers used in Experiment 2, based on average magnitude of the effect of phase scrambling. The textures that show a larger change (>2 dB) tend to have more prominent edges and appear to be more sparse.
Figure 6
 
Stimuli used to determine carrier contrast thresholds in Experiment 3 shown at a range of carrier contrasts (top to bottom: 8, 5, and 3% RMS contrast), all with 100% modulation depth. Thresholds were determined for (A) intact and (B) scrambled textures.
Figure 7
 
Carrier contrast threshold results from Experiment 3 for two observers. Each symbol plots the phase-scrambled vs. the intact threshold for a particular texture. The dashed line indicates where a texture's intact and phase-scrambled thresholds correspond exactly. Carrier contrast thresholds are centered on the 1:1 line, suggesting no systematic change in detectability when a texture is phase scrambled. Variation in thresholds between observers, textures, and conditions is minimal.
Figure 8
 
Modulation depth threshold results from Experiment 3 for two observers. Each symbol plots the phase-scrambled versus the intact threshold for a particular texture, with each texture a fixed increment above its detection threshold. The dashed line indicates where a texture's intact and phase-scrambled thresholds correspond exactly. Modulation depth thresholds measured with these detectability-equated contrasts are still systematically lower for scrambled than for intact textures.
Figure 9
 
Relationship between image statistic indices and the change in segmentation threshold between intact and scrambled conditions in decibels. (A) Slope of falloff of Fourier spectrum. (B) Sparseness, as measured using intensity histogram kurtosis. (C) Sparseness, as measured with the LSSM metric of Hansen and Hess (2007)—a wavelet-based metric developed for natural scenes. (D) Edge density, modified from Bex (2010). Note the lack of relationship between threshold change and kurtosis, LSSM, or slope but a clear correlation between edge density and threshold change.
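Three of the quantities in this caption have simple textbook forms, sketched below in NumPy: a threshold change in decibels, intensity histogram kurtosis, and the slope of the amplitude spectrum falloff. The function names are hypothetical, and the `20 * log10` decibel convention and the log-log least-squares fit are assumptions, since the caption does not specify either. The LSSM and edge density metrics are not reproduced here.

```python
import numpy as np

def threshold_change_db(intact, scrambled):
    # Assumption: amplitude-style decibels, 20*log10 of the threshold ratio.
    return 20.0 * np.log10(np.asarray(intact) / np.asarray(scrambled))

def intensity_kurtosis(img):
    # Fourth standardized moment of the pixel intensities (3.0 for a Gaussian).
    z = (img - img.mean()) / img.std()
    return np.mean(z ** 4)

def spectral_slope(img):
    # Least-squares slope of log amplitude against log spatial frequency.
    amp = np.abs(np.fft.fft2(img - img.mean()))
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    radius = np.hypot(fx, fy)
    mask = (radius > 0) & (amp > 0)   # skip DC and empty bins before taking logs
    slope, _ = np.polyfit(np.log(radius[mask]), np.log(amp[mask]), 1)
    return slope

rng = np.random.default_rng(1)
noise = rng.standard_normal((256, 256))   # white noise: flat amplitude spectrum

# Build a 1/f image by dividing the noise spectrum by spatial frequency.
fy = np.fft.fftfreq(256)[:, None]
fx = np.fft.fftfreq(256)[None, :]
radius = np.hypot(fx, fy)
radius[0, 0] = 1.0                        # avoid dividing by zero at DC
pink = np.fft.ifft2(np.fft.fft2(noise) / radius).real

change_db = threshold_change_db(2.0, 1.0)  # intact threshold twice the scrambled one
```

Under these conventions a halved threshold corresponds to a change of about 6 dB, white noise gives a slope near 0, and the 1/f image gives a slope near -1.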