Abstract
Unlike most visual tasks, contrast discrimination has been reported to be unchanged by practice (Dorais & Sagi, 1997; Adini, Sagi, & Tsodyks, 2002), unless practice is undertaken in the presence of flankers (context-enabled learning, Adini et al., 2002). Here we show that under experimental conditions nearly identical to those in the no-flanker practice experiment of Adini et al. (2002), practice significantly improved contrast discrimination. Moreover, in a separate experiment, we found that practice without flankers can improve contrast discrimination to a level only reached with flankers in Adini et al. (2002), but further practice with flankers produces no further improvement of contrast discrimination. These results call into question whether the “context-enabled learning” proposed by Adini et al. (2002) is different from regular contrast learning without flankers. In separate experiments, we found that contrast learning is tuned to spatial frequency, orientation, retinal location, and, unexpectedly, contrast. We also replicated Sagi, Adini, Tsodyks, and Wilkonsky’s (2003) more recent finding that no regular contrast learning occurs if reference contrasts are randomly interleaved (contrast roving), and further demonstrated that flankers have no effect on contrast learning under contrast roving, another piece of evidence equating “context-enabled learning” to regular contrast learning. The contrast specificity of learning and the lack of learning under contrast roving provide new evidence in favor of a multiple contrast-selective channels model of contrast discrimination, and against saturating transducer models and multiplicative noise models.
More than 30 observers, mostly University of California-Berkeley undergraduate students, with normal or corrected-to-normal vision participated in different phases of the study. Most were new to psychophysical experiments and unaware of the specific purposes of the experiments, though they were informed that the general goal of this study was to investigate whether visual performance could be improved by practice.
The stimuli in Experiments I–III were generated by software from Vision Research Graphics, Inc. (Durham, NH) and presented on a 21-inch Image System Max21L monochrome monitor (1024 × 512 resolution, 0.28 mm (H) × 0.41 mm (V) pixel size, 117-Hz frame rate, 50 cd/m² mean luminance, and 3.8° × 3.0° screen size at the 5.64-meter foveal viewing distance). Luminance of the monitor was made linear by means of a 15-bit look-up table. The stimuli in Experiment IV were generated by software from the Neurometrics Institute (Berkeley, CA) and presented on a 19-inch Dell UltraScan P991 color monitor (640 × 480 resolution, 0.59 mm (H) × 0.54 mm (V) pixel size, 60-Hz frame rate, 60 cd/m² mean luminance, and 4.0° × 3.0° screen size at the 5-meter foveal viewing distance). Luminance of this monitor was made linear by means of an 8-bit look-up table. Experiments were run in a dimly lit room.
The test stimulus was a Gaussian windowed sinusoidal grating (Gabor patch). Under most stimulus conditions (foveal viewing), this Gabor patch had a spatial frequency of 6 cycles per degree (cpd), and the standard deviation of the Gaussian envelope was σ = 0.12°. In Experiments II and IV, additional flanking stimuli were used, either simulated by increasing the length of the pedestal, or by adding additional pairs of Gabor patches. Further details will be provided in “Results.”
Contrast thresholds were measured with a temporal two-alternative forced-choice (2AFC) staircase procedure. In Experiments I–III, staircases at different reference contrasts were run non-interleaved. In Experiment IV, staircases at all reference contrasts were run randomly interleaved (contrast roving). Within a staircase, the test and reference stimuli were separately presented in the two stimulus intervals (≈103 msec each) in a random order separated by a 600-msec interstimulus interval. Each stimulus interval was accompanied by an auditory tone of the same duration to reduce temporal uncertainty. The observers’ task was to judge which stimulus interval contained the higher contrast Gabor. Each trial was preceded by a 6.3′ × 6.3′ fixation cross, which disappeared 100 msec before the beginning of the trial. Auditory feedback was given on incorrect responses. Each staircase consisted of four preliminary reversals and eight experimental reversals when run non-interleaved, or two preliminary reversals and six experimental reversals when run interleaved. The step size of the staircase was 0.05 log units. A classical 3-down-1-up staircase rule was followed, which resulted in a 79.4% convergence level of the staircase. The geometric mean of the experimental reversals was taken as the contrast threshold for each staircase run.
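The staircase rule described above can be illustrated with a short sketch. It is not the software used in the study: the function names, the use of log10 contrast increment as the tracked variable, the definition of a reversal level, and the idealized observer are all assumptions made for illustration. The sketch shows where the 79.4% convergence level comes from (the track is in equilibrium when three consecutive correct responses are as likely as not, so p³ = 0.5) and how a threshold is taken as the geometric mean of the experimental reversals.

```python
def convergence_level(n_down):
    """Proportion correct at which an n-down-1-up staircase is in equilibrium.

    The track is equally likely to step down or up when p**n_down == 0.5.
    """
    return 0.5 ** (1.0 / n_down)


def run_staircase(respond, start_log_c, step=0.05, n_down=3,
                  n_prelim=4, n_exp=8):
    """Run one n-down-1-up staircase on a log10 contrast axis.

    respond(log_c) returns True when the (hypothetical) observer is correct.
    Returns the mean log level over the experimental reversals, which is
    the log of the geometric mean of the reversal contrasts.
    """
    log_c = start_log_c
    run_correct = 0   # consecutive correct responses at the current level
    last_dir = 0      # -1: last step was down, +1: up, 0: no step yet
    reversals = []
    while len(reversals) < n_prelim + n_exp:
        if respond(log_c):
            run_correct += 1
            if run_correct < n_down:
                continue              # not enough correct responses to step
            run_correct, direction = 0, -1
        else:
            run_correct, direction = 0, +1
        if last_dir != 0 and direction != last_dir:
            reversals.append(log_c)   # level at which the track turned
        last_dir = direction
        log_c += direction * step
    experimental = reversals[n_prelim:]   # discard preliminary reversals
    return sum(experimental) / len(experimental)


# Idealized observer (assumption): always correct at or above log10 = -1.02.
estimate = run_staircase(lambda log_c: log_c >= -1.02, start_log_c=-0.5)
print(round(convergence_level(3), 3))  # 0.794
print(round(estimate, 3))              # -1.025
```

With a real observer the responses are probabilistic, so the track wanders around the 79.4% point rather than alternating between two levels as it does for this idealized observer.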
Experiment I. Perceptual learning of contrast discrimination at multiple contrasts
Experiment II. Context-enabled learning after exclusion of regular contrast learning
Experiment III. The specificities of contrast learning: Stimulus dimensions, retinal location, and eye of origin
Experiment IV. Contrast learning with roving contrasts
Some differences between our study and the Adini et al. study
Why does contrast learning not transfer to neighboring contrasts? Three models of contrast discrimination
Why is there such a large interest in visual learning? We suspect that this interest stems from the possibility that the learning takes place in early stages of visual processing. Learning in late stages of processing would be less interesting because there are already many examples of cognitive learning tasks. For example, if the visual task involved detecting a subtle pattern with many distracters, we would not be surprised to find strong learning effects as one learns to recognize and discount the distracters. We are more interested when we find learning with simple patterns with aspects such as non-transfer to different locations or orientations that indicate that the learning might take place in early stages of processing. However, as noted above, Vogels and Orban (1994) and Mollon and Danilova (1996) showed that non-transfer of learning, often thought to be early, can be explained by central mechanisms.
The massive interconnectedness of cortex makes it difficult to separate early and late stages of processing even when using brain-imaging techniques. After about 150 msec, it is expected that effects of late decision stages will affect V1 processing through feedback (Lee & Mumford, 2003). In order to be concrete about the subtle distinction between learning occurring in early versus late stages of processing, we propose the following simple operational definition. Learning at an early stage would allow a microelectrode implanted in an early visual area (say in V1) to produce a direct correlate of learning in the first 125 msec of response. For example, if noise reduction (the second model of Figure 6) occurred early, then a microelectrode in a neuron responding to the Gabor patch would show reduced noise in its early firing rate after learning. On the other hand, learning at a late stage would not affect the initial responses of neurons in primary visual cortex. For example, reduced noise in the comparison stage (e.g., by improving the memorized contrast template or by learning a more efficient way to compare the memory to the test stimulus) need not show up in the initial firing rate and would therefore be called late stage learning even if the computations were carried out in V1. Similarly, with our selective mechanism hypothesis (the third model of Figure 6), a decision stage with access to the multiple mechanisms spanning the full contrast range would be needed. Thus for the contrast selective mechanism hypothesis, the main action is carried out at a higher stage of processing. The first two hypotheses (change in the contrast response function or the multiplicative noise in neurons in primary visual cortex) would allow a perceptual decision to be made based on the activity of single cells or cell assemblies in early vision. In an important sense we are using early and late as referring to the temporal domain as well as to whether the learning is top-down. This distinction is relevant to our earlier mention of late versus early mechanisms for learning.
It is worth mentioning that evidence directly linking perceptual learning to neural plasticity has been scarce and inconclusive. Schoups, Vogels, Qian, and Orban (2001) recently reported sharpening of orientation tuning functions after orientation discrimination practice in the primary visual cortex of monkeys, but Ghose, Yang, and Maunsell (2002) failed to find evidence for similar physiological correlates of orientation discrimination learning in a separate monkey study and attributed the behavioral performance improvement to more central pooling and decision processes (similar to the third model in Figure 6). Recent work suggests that such learning does take place in V4 (Yang & Maunsell, 2004). It would be interesting to know whether more central visual processing is universal in perceptual learning of other visual discrimination tasks, such as phase and spatial frequency discrimination and Vernier acuity, even if we cannot completely exclude the role of early neural plasticity in contrast learning.
Given that high-level (late) learning is well established, the burden of proof in the early versus late argument should, therefore, be on the side of those arguing that learning is done early. By this reasoning one need not provide evidence against early learning. However, we suggest that our roving experiments do provide evidence against the learning being early.
Why does roving inhibit contrast learning? Perhaps the most surprising result of our study, as well as Sagi et al. (2003), is that roving among 4 reference contrasts, in a 2AFC experiment, inhibits learning (Figure 5). The difference between a roving experiment and a blocked experiment is that in the former the only discrimination cue is the contrast difference between the first and second intervals. In a blocked design experiment, there is the additional cue that after a few trials a long-term memory trace of the reference contrast is built up and that memory trace can be used as a reference for both intervals of the 2AFC trial. It may be useful to illustrate the two cases where the perceived signal strength is represented by a number. Suppose the first interval of the roving trial has a perceived strength of 30 ± 5 and the second interval has a strength of 34 ± 5. In this case, the perceived contrast difference would have d′ = (34 − 30)/5 = 0.8, which is below the d′ = 1 threshold. For the blocked experiment, the perceived strength of the signal relative to the memorized reference would be smaller, more accurate numbers such as 2 ± 3 and 6 ± 3 for the first and second intervals. In this case, the d′ of the contrast difference would be (6 − 2)/3 ≈ 1.33, which is above the d′ = 1 threshold. In the blocked case, learning could decrease thresholds by either improved memorization of a stable reference template or by learning to more accurately compare the memorized reference to each test.
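The arithmetic of the roving versus blocked illustration can be checked with a few lines of code. This is only a sketch of the simplified definition used in the text (mean difference divided by the common standard deviation). The function name `d_prime` is a hypothetical helper, and the numbers are the illustrative values from the paragraph above, not measured data.

```python
def d_prime(strength_first, strength_second, sd):
    """Simplified d' as used in the text: mean difference over the common SD."""
    return abs(strength_second - strength_first) / sd


# Roving: each interval is judged on its raw perceived strength (30 +/- 5, 34 +/- 5).
roving = d_prime(30, 34, 5)

# Blocked: strengths are coded relative to a memorized reference (2 +/- 3, 6 +/- 3).
blocked = d_prime(2, 6, 3)

print(roving)             # 0.8
print(round(blocked, 2))  # 1.33
print(roving < 1 < blocked)  # True
```

The signal difference is the same 4 units in both cases, so under these assumed numbers the whole advantage of the blocked design comes from the smaller noise when each interval is compared with a stable memorized reference.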
We now examine how the three models of contrast discrimination discussed in the preceding section would deal with the results of our roving experiment. For both the contrast response function model and the multiplicative noise model, it is hard to see why roving would make it more difficult to learn contrast discrimination. If practice facilitates the contrast response function or reduces the noise at one contrast level, there is no obvious reason why it should not do the same if the contrasts are roving. (Of course one could always develop post hoc models with assumptions that make it difficult to do learning with roving contrasts.) For the multiple contrast-selective channels model, on the other hand, there is a natural explanation for why roving causes a problem. As discussed in the section on transfer of contrast learning, according to this model, learning takes place because the decision stage learns to attend to the optimally sensitive mechanisms. In the presence of roving, this type of selective attention would not be possible because attention would be spread out, as in the pre-practice runs.
We thank Dov Sagi for communications, Ariella Popple for helping initiate this study, Yasoto Tanaka for discussions as an insider of both our study and Sagi et al., and our 30+ subjects for their hard work. This research is supported by National Institutes of Health Grants R01EY01728 and R01EY04776.
Commercial relationships: none.
Corresponding author: Cong Yu.
Address: Chinese Academy of Sciences, Institute of Neuroscience, Shanghai, China.
% MATLAB code with the full details of the modeling that generated Figure 6. Comments are given in green.
c=0:.01:8; %the pedestal contrasts being sampled
type=['b- r--']; %blue and red are pre- and post-learning
for ia=0:5 % even and odd numbers for pre- and post-learning
  % ia=0,1 for top; 2,3 for middle; 4,5 for bottom panels
  n=c.^0; %Constant noise [for model 1 (top panels) and model 3 (bottom panels)]
  ia2=mod(ia,2);ia12=ia-ia2; %for plotting conditions
  if ia<2, anum=ia*.2; a=2; %conditions for model 1
  elseif ia<4, anum=0; a=10; n=(1+c).^.7-ia2*exp(-(c-4).^2);%for model 2 (middle panels)
  end
  resp= (a+1)*c.^2./(a+c.^1.5)+anum*exp(-(c-4).^2); %contrast response function for Models 1 & 2
  subplot(3,2,1+ia12); %specify where to place the plot
  if ia<2, plot(c,resp,type(3*ia2+1:3*ia2+3),c,n); hold on %plot left panel of Model 1
  elseif ia<4, plot(c,resp,c,n,type(3*ia2+1:3*ia2+3));hold on %plot left panel of Model 2
  else offset=[.4*1.3.^[0:13]]-.4; %calculate plot 5. All the shifts of black curves
    if ia==5; offset=[offset offset(10)+.05];end %add an extra mechanism post-learning
    for iplot=1:length(offset)
      off=offset(iplot); %offsets for contrast response functions of plot 5
      cshift=(c-off).*(c>off); %contrast of shifted curves
      R(iplot,:)=10*cshift.^2./(1+cshift.^2);%Naka-Rushton type saturation response
    end
    resp=mean(R); %the CRF is the average of all the separate neural responses
    if ia==4; subplot(3,2,5);plot(c,R,'k',c,resp,'b');hold on %plot Model 3 pre-learning
    else subplot(3,2,5);plot(c,R(end,:),'k',c,resp,'r--',[0 8],[1 1],'b');%3 post-learning
    end
  end
  ylabel('CRF and noise');if ia==0, title('contrast response functions and noise');end
  for i=1:length(c) %loop header is missing from the source; looping over every sampled contrast is assumed
    [rmin,cmin]=min(abs((resp-resp(i))./n-1));%solves {(CRF(cmin)-CRF(c))/noise = 1} for cmin
    cjnd(i)=cmin/100-c(i);%The jnd contrast. The /100 converts sample units to contrast units
  end
  subplot(3,2,2+ia12);plot(c,cjnd,type(3*ia2+1:3*ia2+3));hold on %plot right panels
  if ia==1, title(' jnd. pre (blue) and post (red)');end;
  axis([0,6,0,2.2]); grid on
end