Research has shown that the processing time for discriminating illusory contours is longer than for real contours. However, little is known about whether the visual processes associated with detecting regions of illusory surfaces are also slower than those responsible for detecting luminance-defined images. Using a speed–accuracy trade-off (SAT) procedure, we measured accuracy as a function of processing time for detecting illusory Kanizsa-type and luminance-defined squares embedded in 2D static luminance noise. The data revealed that the illusory images were detected at a slower processing speed than the real images, while the points in time at which accuracy departed from chance did not differ significantly between the two stimuli. The classification images for detecting illusory and real squares showed that observers employed similar detection strategies, using surface regions of the real and illusory squares. The lack of significant differences between the *x*-intercepts of the SAT functions for illusory and luminance-modulated stimuli suggests that the detection of surface regions of both images could be based on activation of a single mechanism (the dorsal magnocellular visual pathway). The slower speed for detecting illusory images as compared to luminance-defined images could be attributed to slower filling-in of regions of illusory images within the dorsal pathway.

*What is the neural representation of such illusory images?* Initially, completion of illusory contours was interpreted by cognitive theories as an attempt to find the most probable solution to a perceptual problem (Gregory, 1972). Computational models (Grossberg & Mingolla, 1985; Hess & Field, 1999; Spillmann & Dresp, 1995) and neuroimaging studies (for review, see Seghier & Vuilleumier, 2006) have proposed that distinct global attributes of an illusory object are processed by separate mechanisms. A fast local low-level mechanism including cells in striate and extrastriate visual areas (Bakin, Nakayama, & Gilbert, 2000; Lee & Nguyen, 2001; Nieder & Wagner, 1999; Peterhans & von der Heydt, 1989; Ramsden, Hung, & Roe, 2001; Redies, Crook, & Creutzfeldt, 1986; von der Heydt, Peterhans, & Baumgartner, 1984) is responsible for initial encoding of illusory contours. Intracranial recordings from cells in monkeys showed that Kanizsa-type illusory contours first activated cells in V2 (70–95 ms after stimulus onset), followed by a later response in V1 at 100–190 ms (Lee & Nguyen, 2001). The responses of V1 cells to contours of real (bright, gray, and outline) squares appeared earlier (45 ms) than the responses to the illusory contours (100 ms) induced by the Kanizsa square. These findings suggested that the contour completion in V1 might be due to feedback modulations from V2 cells.

*Is there a difference between the perceptual dynamics of real and illusory image processing?* Using backward masking of Kanizsa illusory figures, Reynolds (1981) found that at short stimulus onset asynchronies (SOAs of 50 ms) subjects reported seeing only distinct inducers of the illusory figure, followed by seeing the illusory figure (SOAs of 75 ms), while the detailed figure shape (curved or straight edged) could be discriminated only at longer SOAs (>100 ms). Ringach and Shapley (1996) studied the time course of illusory contour processing by using a shape discrimination task with Kanizsa-type figures produced by inducers that were rotated to form fat (bulged outward) or thin (tapered inward) illusory shapes. The results showed that the performance for discriminating fat and thin illusory contours was reduced by a mask containing local orientation information when flashed at SOAs of less than 117 ms. A second Kanizsa-type mask interfered with task performance at longer SOAs (140–200 ms). The authors suggested that illusory contour processing involved two stages: detection of local boundary segments followed by integration of global illusory contours. Using the same paradigm, Imber, Shapley, and Rubin (2005) found that late masking effects of illusory contours could be observed even with illusory maskers that did not overlap spatially with the target illusory contours. In contrast, real luminance-defined contours were not effective as late-stage masking stimuli. These results led to the suggestion that late-stage masking may occur at visual cortical stages that are involved in shape categorization of illusory surfaces bounded by illusory contours.

^{2} and size of 30 × 23 deg. A custom video summation device (Pelli & Zhang, 1991) was used to produce 256 gray levels with 12-bit precision. The luminance response of the display was measured by an OptiCal photometer (Cambridge Research System) interfaced to the PC. The monitor luminance was linearized using the inverse function of the non-linear luminance response when computing the stimulus images. Stimuli were viewed binocularly at a viewing distance of 60 cm. Participants were in a darkened room where the only source of light was the computer display. Custom software written in Pascal for MS-DOS was used to generate the stimuli and control the experiment.

*C*) of 4% defined as

*SD*s) from the background.

*SD* (5%), and long stimulus duration (1000 ms). After that, the frame contrast was lowered to 4% and the stimulus duration was shortened to 106.7 ms. Both illusory and real squares were used during the training sessions. Observers were trained to respond within 300 ms after the response cue. Noise *SD* was varied to keep detectability index below 2 at the longest response cue lag. After the training sessions, each observer participated in 10 sessions, the results of which were used in the data analysis. At least 100 trials for each experimental condition were collected at each response lag for each participant.

*SD*s. In Experiment 2, noise *SD* was fixed and real decremental squares of 3 contrast levels (−0.3, −0.5, and −0.7%) were presented. Trials with illusory and real squares were presented in separate blocks. The response cue was presented at 6 different lags between 120 and 907 ms after the stimulus onset. The response lag was randomly varied across trials. In both experiments, detectability index (*d*′) was measured as a function of processing time (lag plus mean response latency).

*SD* of 14% (KR) and 12% (MSM); the contrast of the real squares was −0.3%.

*d*′ units) for each experimental condition was computed using the *z* score for hit rates for target-present trials and the *z* score for false-alarm rates for target-absent trials at each response lag. The empirical SAT functions were fit with an exponential function (Dosher, 1976, 1979; McElree, 1993; McElree & Carrasco, 1999):

$$d'(t) = \lambda\left[1 - e^{-\beta (t - \delta)}\right] \ \text{for } t > \delta, \quad \text{otherwise } d'(t) = 0,$$

where *λ* is the asymptotic parameter corresponding to detectability at maximum processing time, *δ* is the *x*-intercept parameter reflecting the discrete point in time when *d*′ = 0 (e.g., sensory encoding, transmission, and motor response delays), and *β* is the rate parameter indexing the speed with which detectability grows from chance to asymptote.
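The two quantities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names `d_prime` and `sat` are hypothetical, *d*′ is assumed to be the difference between the *z*-transformed hit and false-alarm rates, and the SAT function is assumed to have the standard exponential approach-to-a-limit form used in the cited SAT literature. The example parameters are the rounded values reported in Table 3 for observer IH at the low noise level.

```python
import math
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Detectability index: z(hit rate) - z(false-alarm rate).

    Rates must lie strictly between 0 and 1 (inv_cdf is undefined at 0 and 1).
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


def sat(t, lam, beta, delta):
    """Assumed exponential SAT function.

    t     : processing time (ms)
    lam   : asymptotic accuracy (d' units)
    beta  : rate parameter (1/ms)
    delta : x-intercept (ms), the time at which d' departs from 0
    """
    if t <= delta:
        return 0.0
    return lam * (1.0 - math.exp(-beta * (t - delta)))


# Rounded Table 3 values for observer IH, illusory square, low noise SD.
lam, beta, delta = 1.92, 0.0065, 288.0

# At t = delta + 1/beta the function reaches 1 - 1/e (about 63%) of the asymptote.
fraction_at_time_constant = sat(delta + 1.0 / beta, lam, beta, delta) / lam
```

Under this form, 1/*β* is the time needed after the intercept to reach about 63% of the asymptote, which is why *δ* + 1/*β* can serve as a single composite measure of processing dynamics.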

*λ*), *x*-intercept (*δ*), and rate (*β*) to a fully saturated model in which each function was fit with a unique set of parameters (3*λ*–3*β*–3*δ*).

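_{c}), which was calculated using the following equation (Burnham & Anderson, 2002, pp. 60–85):

$$\mathrm{AIC_c} = n \ln\!\left(\frac{\sum_{i=1}^{n} \left(\alpha_i - \alpha_{i,\mathrm{est}}\right)^2}{n}\right) + 2K + \frac{2K(K+1)}{n-K-1},$$

where *α*_{i} are the data values, *α*_{i est} are the model calculations, *n* is the number of data points, and *K* is the number of free parameters plus one.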
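A small Python sketch of this criterion follows. It assumes the least-squares form of the corrected Akaike information criterion (residual sum of squares divided by *n*, plus the small-sample correction term), which matches the variable definitions given here; the function name `aicc` and the toy data are hypothetical and are not taken from the study.

```python
import math


def aicc(data, fitted, n_free_params):
    """Corrected Akaike information criterion for a least-squares fit.

    data          : observed values (alpha_i)
    fitted        : model predictions (alpha_i est)
    n_free_params : number of fitted parameters; K = n_free_params + 1
    """
    n = len(data)
    k = n_free_params + 1
    if n - k - 1 <= 0:
        raise ValueError("too few data points for the small-sample correction")
    ss = sum((a - f) ** 2 for a, f in zip(data, fitted))
    if ss == 0.0:
        raise ValueError("a perfect fit gives an undefined log-likelihood term")
    return n * math.log(ss / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)


# Toy example (hypothetical numbers): 10 points, each mis-predicted by 0.1.
toy_data = [float(v) for v in range(1, 11)]
toy_fit = [v + 0.1 for v in toy_data]
toy_score = aicc(toy_data, toy_fit, n_free_params=2)
```

Lower values indicate a better trade-off between goodness of fit and the number of parameters, so adding parameters only helps if it reduces the residual error enough to offset the penalty terms.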

_{c} approach is based on information theory and does not use the traditional “hypothesis testing” statistical paradigm; rather, it determines how well the data support each model. The model with the smallest AIC_{c} value is most likely to be correct. If *A*_{a} and *A*_{b} are the AIC_{c} values for models *a* and *b*, respectively, and *A*_{a} < *A*_{b} (Δ = *A*_{b} − *A*_{a} > 0), Akaike's weight,

$$w = \frac{1}{1 + e^{-0.5\Delta}},$$

gives the probability that model *a* is correct.
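The relation between an AIC_{c} difference, Akaike's weight, and the evidence ratio can be checked numerically. The sketch below assumes Δ is the positive difference between the AIC_{c} of the competing model and that of the preferred model, that the weight is 1/(1 + e^{−Δ/2}), and that the evidence ratio is e^{Δ/2}; the function names are hypothetical. These assumptions reproduce the tabulated values, for example ΔAIC_{c} = 5.27 gives a weight of about 93.3% and an evidence ratio of about 14.

```python
import math


def akaike_weight(delta):
    """Probability that the model with the lower AICc is correct.

    delta: AICc of the competing model minus AICc of the preferred model (> 0).
    """
    return 1.0 / (1.0 + math.exp(-0.5 * delta))


def evidence_ratio(delta):
    """How many times more likely the preferred model is than the competitor."""
    return math.exp(0.5 * delta)


# Two AICc differences reported for observers KR and IH in Table 1.
for delta in (5.27, 9.76):
    print(f"delta = {delta}: weight = {100 * akaike_weight(delta):.1f}%, "
          f"evidence ratio = {evidence_ratio(delta):.1f}")
```

The two quantities carry the same information: the weight equals ER/(1 + ER), so an evidence ratio of 14 corresponds to a weight of 14/15.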

^{T}. The standard error of the classification images was calculated as (Keane et al., 2007; Murray, Bennett, & Sekuler, 2002)

*k*_{ij} represents the kernel elements (*i*, *j*), *m* is the kernel size, *n*_{00}, *n*_{01}, *n*_{10}, and *n*_{11} denote the number of trials in each stimulus-response category, and *σ*_{N} is the standard deviation of the external noise.

*t*-value) and the probability density function for the *t*-distribution. In order to correct for multiple comparisons, we employed an adaptive procedure for controlling the false discovery rate (Benjamini, Krieger, & Yekutieli, 2006) using a Matlab function written by David Groppe (http://www.mathworks.com/matlabcentral/fileexchange/27423-two-stage-benjamini-krieger-yekutieli-fdr-procedure). This procedure is a less conservative and more powerful method for correcting for multiple comparisons than the Bonferroni procedure.
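For readers unfamiliar with the adaptive approach, the following Python sketch outlines a two-stage step-up procedure of the kind described by Benjamini, Krieger, and Yekutieli (2006). It is an illustrative reimplementation under stated assumptions, not the Matlab function used in the study: stage 1 applies the linear step-up (Benjamini-Hochberg) test at level q/(1 + q), the number of true null hypotheses is then estimated as the number of tests not rejected, and stage 2 repeats the step-up test at a correspondingly relaxed level. The function names and the example *p*-values are hypothetical.

```python
def bh_reject_count(sorted_p, q):
    """Linear step-up test: number of rejections among ascending p-values."""
    m = len(sorted_p)
    count = 0
    for rank, p in enumerate(sorted_p, start=1):
        if p <= q * rank / m:
            count = rank  # keep the largest rank that passes its threshold
    return count


def two_stage_fdr(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    sorted_p = [p_values[i] for i in order]

    q1 = q / (1.0 + q)                  # stage 1 level
    r = bh_reject_count(sorted_p, q1)
    if 0 < r < m:
        m0_hat = m - r                  # estimated number of true nulls
        r = bh_reject_count(sorted_p, q1 * m / m0_hat)  # stage 2

    rejected = [False] * m
    for rank in range(r):
        rejected[order[rank]] = True
    return rejected


# Hypothetical p-values: a plain step-up test at q = 0.05 rejects only the
# first two, whereas the two-stage procedure rejects the first four.
example = two_stage_fdr([0.001, 0.008, 0.039, 0.041, 0.6])
```

The gain in power comes from stage 2: once some hypotheses are clearly rejected, fewer true nulls are assumed to remain, so the thresholds for the remaining tests can be relaxed while still controlling the false discovery rate.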

*SD* decreased. The average *d*′ values of the empirical asymptotic accuracy [the average accuracy at the two longest lags (McElree & Carrasco, 1999)] were 1.99, 1.29, and 0.98 for low, medium, and high noise levels, respectively. The Shapiro–Wilks test found that the data did not show significant departure from normality. This allowed using ANOVA, which showed a main effect of noise *SD* on asymptotic accuracy (*F*(2,9) = 26.9, *p* < 0.001). For each subject, we evaluated the significance of the differences in *d*′ measured in two experimental conditions using binomial statistics (Macmillan & Creelman, 2005). Pair-wise comparisons showed that the asymptotic accuracies for subjects IH, JGI, and MSM at medium and low noise levels were significantly (*p* < 0.05, Bonferroni correction) higher than those at high noise level, while the asymptotic accuracy differences for subject KR were not significant.

*d*′. The Shapiro–Wilks test showed that the RTs in 5 out of 6 data sets did not have normal distributions. Therefore, we used a non-parametric method for testing equality of population medians among conditions. The Kruskal–Wallis ANOVA did not find a significant effect of the noise *SD* on the RTs for subject IH [median value and median absolute deviation: 374 ± 29.9 ms (low *SD*), 372 ± 32.1 ms (medium *SD*), 376 ± 22.0 ms (high *SD*)] and JGI [367.5 ± 29.5 ms (low *SD*), 374 ± 25.8 ms (medium *SD*), 387 ± 27.9 ms (high *SD*)].

*λ*–1*β*–1*δ* model produced the smallest value of the AIC_{c}: −38.2 (IH), −37.0 (KR), −49.2 (JGI), and −51.1 (MSM). The differences between the AIC_{c} values yielded by the other models and those of the 3*λ*–1*β*–1*δ* model were in the range of 5.27–39.1 (evidence ratio > 14; Akaike's weight > 93.3%). According to Akaike's method, these findings indicate that the 3*λ*–1*β*–1*δ* model is at least 14 times more likely to be correct than the other models. Additionally, the *λ* values estimated by means of the 3*λ*–1*β*–1*δ* model were identically ordered for all observers.

| Subject | Measure | 1*λ*–1*β*–1*δ* | 3*λ*–1*β*–1*δ* | 3*λ*–1*β*–3*δ* | 3*λ*–3*β*–1*δ* | 3*λ*–3*β*–3*δ* |
|---|---|---|---|---|---|---|
| IH | AIC_{c} | −17.6 | −38.2 | −28.5 | −27.3 | −10.1 |
| | ΔAIC_{c} | 20.6 | | 9.76 | 10.9 | 28.1 |
| | AW (%) | >99.9 | | 99.2 | 99.6 | >99.9 |
| | ER | 3 × 10^{4} | | 132 | 234 | 10^{8} |
| JGI | AIC_{c} | −18.2 | −49.2 | −39.6 | −39.8 | −22.8 |
| | ΔAIC_{c} | 31.03 | | 9.65 | 9.49 | 26.5 |
| | AW (%) | >99.9 | | 99.2 | 99.1 | >99.9 |
| | ER | 6 × 10^{6} | | 124 | 115 | 6 × 10^{5} |
| KR | AIC_{c} | −31.7 | −37 | −27.6 | −28.5 | −9.57 |
| | ΔAIC_{c} | 5.27 | | 9.42 | 8.52 | 27.5 |
| | AW (%) | 93.3 | | 99.1 | 98.6 | >99.9 |
| | ER | 14 | | 111 | 71 | 9 × 10^{5} |
| MSM | AIC_{c} | −12 | −51.1 | −39.7 | −41.2 | −23 |
| | ΔAIC_{c} | 39.1 | | 11.4 | 9.88 | 28.2 |
| | AW (%) | >99.9 | | 99.7 | 99.3 | >99.9 |
| | ER | 3 × 10^{8} | | 300 | 140 | 1.3 × 10^{6} |

*SD* was selected to produce asymptotic accuracy of about 1 *d*′ unit for detecting the illusory target for each observer: 10% (JGI), 12% (IH, MSM), and 14% (KR). Real decremental squares of three contrast levels (−0.3, −0.5, and −0.7%) were embedded in luminance noise and surrounded by an incremental frame (Figure 1a).

*d*′ values of the empirical asymptotic accuracy were 1.56, 2, and 2.17 for target contrasts of −0.3, −0.5, and −0.7%, respectively. The results of the Shapiro–Wilks test showed that the data did not significantly depart from normality. ANOVA found a significant effect of target contrast on the empirical asymptotic accuracy averaged across subjects (*F*(2,9) = 4.28, *p* < 0.05). Pair-wise comparisons, using binomial statistics (Macmillan & Creelman, 2005), showed that the asymptotic accuracies of subjects IH and KR at target contrasts of −0.5 and −0.7% were significantly (*p* < 0.05, Bonferroni correction) higher than those at −0.3% contrast. The asymptotic accuracy data of subjects JGI and MSM did not reach significant differences.

*d*′. The Shapiro–Wilks test found that the reaction times in 4 out of 6 data sets did not have normal distributions. The Kruskal–Wallis test did not show a significant effect of the stimulus contrast on the RTs for subject IH [median value and median absolute deviation: 387 ± 30.3 ms (−0.3% contrast), 374 ± 27.3 ms (−0.5% contrast), 401 ± 29.6 ms (−0.7% contrast)] and JGI [374 ± 30.6 ms (−0.3% contrast), 373 ± 28.0 ms (−0.5% contrast), 374 ± 30.5 ms (−0.7% contrast)].

*λ*–1*β*–1*δ* model produced the smallest values of the AIC_{c} for all observers: −35.2 (IH), −49.7 (JGI), −41.3 (KR), and −49 (MSM). The differences between the AIC_{c} values yielded by the other models and those of the 3*λ*–1*β*–1*δ* model were in the range of 6.14–37.4 (evidence ratio > 22; Akaike's weight > 95.6%). These findings indicate that the 3*λ*–1*β*–1*δ* model is at least 22 times more likely to be correct than the other models. The *λ* values estimated by the 3*λ*–1*β*–1*δ* model were identically ordered for all observers.

| Subject | Measure | 1*λ*–1*β*–1*δ* | 3*λ*–1*β*–1*δ* | 3*λ*–1*β*–3*δ* | 3*λ*–3*β*–1*δ* | 3*λ*–3*β*–3*δ* |
|---|---|---|---|---|---|---|
| IH | AIC_{c} | −13.3 | −35.2 | −23.4 | −25.2 | −6.26 |
| | ΔAIC_{c} | 21.9 | | 11.8 | 9.96 | 28.9 |
| | AW (%) | >99.9 | | 99.7 | 99.3 | >99.9 |
| | ER | 6 × 10^{4} | | 372 | 145 | 2 × 10^{6} |
| JGI | AIC_{c} | −43.5 | −49.7 | −37.4 | −37.8 | −21.9 |
| | ΔAIC_{c} | 6.14 | | 12.3 | 11.9 | 27.8 |
| | AW (%) | 95.6 | | 99.8 | 99.7 | >99.9 |
| | ER | 21.5 | | 469 | 377 | 10^{6} |
| KR | AIC_{c} | −15.5 | −41.3 | −30.3 | −31.3 | −20.4 |
| | ΔAIC_{c} | 25.8 | | 11.01 | 10.06 | 20.9 |
| | AW (%) | >99.9 | | 99.6 | 99.4 | >99.9 |
| | ER | 4 × 10^{5} | | 245 | 153 | 3 × 10^{4} |
| MSM | AIC_{c} | −11.7 | −49 | −38.7 | −39 | −19.6 |
| | ΔAIC_{c} | 37.4 | | 10.27 | 10 | 29.4 |
| | AW (%) | >99.9 | | 99.4 | 99.3 | >99.9 |
| | ER | 10^{8} | | 170 | 148 | 3 × 10^{6} |

*λ*), processing speed (*β*), and *x*-intercept (*δ*)] of the models used to fit the SAT functions for real and illusory images are shown in Table 3. The Shapiro–Wilks test found that the estimated speed parameters (*β*), intercept parameters (*δ*), and asymptotic parameters (*λ*) did not significantly depart from normality. The speed parameters for illusory squares were significantly (paired *t*-test, *p* < 0.005) slower than those for real squares. The mean value (± *SD*) of the difference between the speed parameters (in 1/*β* units) for illusory and real squares was 32 ± 5.5 ms [25.2 (IH), 32.7 (JGI), 38.5 (KR), and 30.1 (MSM) ms]. The intercept parameters for the two stimuli were not significantly different: the mean difference (± *SD*) was −2.2 ± 8.2 ms [−10.5 (IH), −1.2 (JGI), 8.7 (KR), and −5.6 (MSM) ms]. Combining the intercept (*δ*) and the time interval (1/*β*) within which accuracy grows from chance to a fixed level (63%) of the asymptotic accuracy into a composite measure (McElree & Carrasco, 1999) of processing dynamics for each subject showed that the processing dynamics for illusory squares was significantly (mean and *SD*: 29.7 ± 13.6 ms, paired *t*-test, *p* < 0.05) slower than for real squares, by 14.6 (IH), 31.5 (JGI), 47.3 (KR), and 25.3 (MSM) ms.

| Subject | Parameter | Illusory, low noise SD | Illusory, medium noise SD | Illusory, high noise SD | Real, −0.3% contrast | Real, −0.5% contrast | Real, −0.7% contrast |
|---|---|---|---|---|---|---|---|
| IH | λ | 1.92 | 1.53 | 0.75 | 1.5 | 2.37 | 2.47 |
| | β (1/β) | | 0.0065 (155) | | | 0.0077 (129) | |
| | δ | | 288 | | | 299 | |
| JGI | λ | 2.13 | 1.55 | 0.9 | 1.6 | 1.7 | 1.92 |
| | β (1/β) | | 0.0055 (182) | | | 0.0067 (149) | |
| | δ | | 249 | | | 250 | |
| KR | λ | 1.72 | 1.35 | 1.03 | 1.64 | 2.36 | 2.60 |
| | β (1/β) | | 0.0051 (197) | | | 0.0063 (159) | |
| | δ | | 290 | | | 281 | |
| MSM | λ | 2.15 | 1.10 | 0.86 | 1.44 | 1.81 | 2.00 |
| | β (1/β) | | 0.0133 (75) | | | 0.0227 (44) | |
| | δ | | 297 | | | 303 | |

A single *β* (1/*β*) and *δ* value is listed per stimulus type because these parameters were shared across the three levels in the 3*λ*–1*β*–1*δ* model.
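The composite measure of processing dynamics (*δ* + 1/*β*) can be recomputed from the rounded Table 3 entries. The sketch below is only an arithmetic check under that assumption; because it starts from rounded values, the per-observer differences (15, 32, 47, and 25 ms) differ slightly from the 14.6, 31.5, 47.3, and 25.3 ms reported in the text. The variable and function names are hypothetical, and the paired *t* statistic is computed by hand without a *p*-value.

```python
import math
from statistics import mean, stdev

# Rounded Table 3 values in ms: (delta, 1/beta) for illusory and real squares.
table3 = {
    "IH":  {"illusory": (288, 155), "real": (299, 129)},
    "JGI": {"illusory": (249, 182), "real": (250, 149)},
    "KR":  {"illusory": (290, 197), "real": (281, 159)},
    "MSM": {"illusory": (297, 75),  "real": (303, 44)},
}


def composite(delta, inv_beta):
    """Time (ms) at which accuracy reaches about 63% of its asymptote."""
    return delta + inv_beta


def paired_t(differences):
    """Paired t statistic for per-subject differences (df = n - 1)."""
    n = len(differences)
    return mean(differences) / (stdev(differences) / math.sqrt(n))


# Illusory minus real, per observer.
composite_diff = {
    subject: composite(*v["illusory"]) - composite(*v["real"])
    for subject, v in table3.items()
}
mean_diff = mean(composite_diff.values())
t_stat = paired_t(list(composite_diff.values()))
```

With four observers the test has three degrees of freedom, so only a large and consistent difference across observers reaches significance.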

*SD* for each subject [10% (JGI), 12% (IH, MSM), and 14% (KR)]. ANOVA showed a main effect of contrast across subjects (*F*(3,12) = 9.2, *p* < 0.005). Post hoc Tukey HSD test found that the asymptotic accuracies, 2.06 and 2.24, averaged across subjects for real squares of −0.5 and −0.7% contrast levels were significantly (*p* < 0.05 and *p* < 0.005, respectively) higher than that (1.38) for illusory squares (zero contrast).

*SD*, we analyzed the best-fitted values of the 3*λ*–3*β*–3*δ* model parameters. The mean processing speed (in 1/*β* units; Figure 4, black circles) decreased as the contrast of the square increased from 0 to −0.7%. ANOVA found a main effect of target contrast level (*F*(3,12) = 5.2, *p* < 0.05). Post hoc Tukey HSD test showed that the mean processing speeds (104 and 102 ms) for detecting real squares of −0.5 and −0.7% contrast levels were significantly (*p* < 0.05) faster than that (151 ms) for detecting illusory squares of zero contrast. The mean processing speed for detecting real squares of −0.3% contrast (119 ms) was faster than, but not significantly different from, that for detecting illusory squares of zero contrast. The values of the intercept parameter (*δ*) did not show a significant main effect of target contrast.

*SD* for illusory images and stimulus contrast for real images) decreased. To compare the error rates, which were normally distributed, across stimulus strengths, we used one-way ANOVA. The results showed an effect of stimulus strength (*F*(2,9) = 4.48, *p* < 0.05). Post hoc Tukey HSD test found that the mean error rate for detecting illusory squares embedded in low noise *SD* and real squares of −0.7% contrast (13 ± 5%) was significantly (*p* < 0.05) lower than those of higher difficulty [illusory squares embedded in medium noise *SD* and real squares of −0.5% contrast (17 ± 10%); illusory squares embedded in high noise *SD* and real squares of −0.3% contrast (27 ± 14%)].

*p*< 0.05), corrected by false discovery rate (Benjamini et al., 2006) for controlling multiple comparisons, in the corresponding classification images shown in the upper row of Figure 5a. Red pixels are significantly larger than zero; blue pixels are significantly less than zero. The left gray image in Figure 5a represents the classification image for the ideal observer that uses all available information about the stimuli. This classification image shows that the ideal observer uses regions within the area of the real square (blue pixels in Figure 5a, left lower image). The ideal observer uses also a small number of pixels of incremental luminance at the edge of the real square due to differences between the luminance profiles of the frames in the target and non-target stimuli (red pixels in Figure 5a, left lower image). The classification images for observers KR and MSM show that these observers used mainly central and lower regions of the real squares [blue pixels in Figure 5a, middle (KR) and right (MSM) lower images].

*p*< 0.001) only within the gaps within the luminance profiles of the frames of the target-present and target-absent stimuli. The blue (red) patches represent areas having lower (higher) luminance in the target-present than in target-absent stimuli. The classification images for illusory squares [Figure 5b, middle (KR) and right (MSM) columns], however, show that the observers' judgments were based on information from regions within the area of the illusory square; 94% (KR) and 98% (MSM) of the pixels that reached statistical significance [blue pixels in Figure 5b, middle (KR) and right (MSM) lower images] were located within the area of the illusory square.

*d*′ value for detecting the real image was significantly higher than that for the illusory image: KR, *t* = 3.54, *p* < 0.001; MSM, *t* = 5.39, *p* < 0.001.

*λ*–1*β*–1*δ* model; Table 3). The SAT data for illusory squares were also satisfactorily explained by a 3*λ*–1*β*–1*δ* model. The points in time (*x*-intercepts) when accuracy departs from chance were not significantly different for illusory and real squares. However, the illusory image was processed at a slower speed (32 ms in 1/*β* units) than the real image. The composite measure of processing dynamics (*δ* + 1/*β*) for illusory squares was also significantly slower, by 30 ms, than that for real squares.

*SD*. Using the best-fitted values of the free parameters, estimated by the 3*λ*–3*β*–3*δ* model (Figure 4), the processing speeds for detecting real targets at −0.5 and −0.7% contrast levels were significantly faster than that for detecting the illusory square (zero contrast). The processing speed for detecting a real target of −0.3% contrast was faster (by 32 ms in 1/*β* units) than for the illusory square, but this difference was not significant, which may be due to an insufficient number of tested subjects and/or experimental trials. Another possibility is that the luminance-modulated square of −0.3% contrast may be too weak a stimulus to produce significant effects on the processing speed as compared to the effects of the incremental Kanizsa-type frame surrounding the real target.

*x*-intercepts of the SAT functions for both patterns were similar, which did not confirm the above prediction. This suggests that the late masking could reflect processes of contour completion rather than illusory surface representation. This suggestion could be tested by investigating the SAT functions in a shape discrimination task, which may find a shift to longer processing times of SAT functions for illusory contours as compared to those for real contours.