Abstract
We compared perceptual learning in 16 psychophysical studies, ranging from low-level spatial frequency and orientation discrimination tasks to high-level object and face-recognition tasks. All studies examined learning over at least four sessions and were carried out foveally or using free fixation. Comparison of learning effects across this wide range of tasks demonstrates that the amount of learning varies widely between different tasks. A variety of factors seems to affect learning, including the number of perceptual dimensions relevant to the task, external noise, familiarity, and task complexity.
1. Cardinal direction of motion discrimination for a single dot
2. Resolution limit for gratings
3. Cardinal direction of motion discrimination for a field of dots
4. Oblique orientation discrimination
5. Spatial frequency discrimination for a simple plaid
6. Familiar object identification
7. Oblique direction of motion discrimination for a field of dots
8. Oblique direction of motion discrimination for a single dot
9. Spatial frequency discrimination for a complex plaid
10. Cardinal orientation discrimination
11. Vernier offset discrimination
12. Band-pass noise identification with high-contrast noise
13. Band-pass noise identification with low-contrast noise
14. Novel face discrimination with high-contrast noise
15. Simple shape search
16. Novel face discrimination with low-contrast noise
In a typical yes-no psychophysical task, an observer is presented with an observation interval that contains noise (n) alone or contains both signal and noise (s). The observer responds yes (S) if she believes the signal was present and no (N) otherwise. e is the sensory event associated with the observation interval. P(s) is the a priori probability of the signal, and P(s|e) is the a posteriori probability that the signal occurred, given the evidence e. Using Bayes' rule,

P(s|e) = P(e|s)P(s) / [P(e|s)P(s) + P(e|n)P(n)].
In such tasks, observers necessarily have a criterion (βp) for responding S or N, based on the evidence provided by the observation interval. So for a given criterion, we can describe our subject's behavior as follows: respond S if P(s|e) ≥ βp, and respond N otherwise.
For example, in the extreme case, if an observer were rewarded for saying yes correctly and were not penalized for saying yes incorrectly, she might choose the criterion βp = 0 and say S on all trials, regardless of the sensory evidence (e).
P(S|s) is the probability of a hit: saying yes when the signal was present. P(S|n) is the probability of a false alarm: saying yes when only noise was present. P(N|s) is a miss, and P(N|n) is a correct rejection. A receiver operating characteristic (ROC) curve shows how the probabilities of hits and false alarms change as an observer bases her responses on different criteria. As the observer lowers her criterion, the number of hits increases, but so does the number of false alarms. Because an observer only has the choice of responding yes or no, P(S|s)+P(N|s)=1 and P(S|n)+P(N|n)=1. The ROC curve, therefore, also describes the number of misses and correct rejections. If signal and noise are equally likely, and the observer chooses a criterion that maximizes the probability correct, then the probability correct is simply p(c)=P(S|s), or equivalently p(c)=P(N|n).
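For concreteness, the tradeoff along the ROC curve can be sketched numerically. The following Python sketch (our own illustration, not part of the original study; it assumes the equal-variance Gaussian model used in the simulations below) sweeps a criterion across unit-variance signal and noise distributions:

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

d = 1.0                                        # assumed separation (d') of signal and noise means
criteria = [i * 0.05 - 4 for i in range(181)]  # criteria swept from -4 to 5

hits = [1 - Phi(c - d) for c in criteria]      # P(S|s): hit rate at each criterion
fas = [1 - Phi(c) for c in criteria]           # P(S|n): false-alarm rate at each criterion

# With equal priors, percent correct is (P(S|s) + P(N|n)) / 2;
# it peaks at the midpoint criterion c = d/2.
pc = [(h + (1 - f)) / 2 for h, f in zip(hits, fas)]
best_c = criteria[pc.index(max(pc))]
```

Lowering the criterion (moving left along `criteria`) raises hits and false alarms together, tracing out the ROC curve.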
The likelihood ratio lsn(e) provides a measure of the probability of evidence e given that the signal occurred, relative to the probability of e given that noise occurred:

lsn(e) = P(e|s) / P(e|n).
Note that the likelihood ratio is independent of the a priori probabilities of signal and noise. The likelihood ratio is monotonically related to the a posteriori probability, provided the a priori probabilities are not zero. Because the two scales are monotonically related, criteria based on a posteriori decision rules (βp) and the more conventionally used likelihood ratio (β) are related. For example, when signal and noise are equally probable (i.e., P(s) = P(n) = 0.5), it can be shown that

βp = β / (1 + β),

and a likelihood ratio criterion of β has an exact equivalent in terms of a posteriori probabilities, such that

β = βp / (1 − βp).
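The correspondence between β and βp is easy to verify numerically. A Python sketch (our own, assuming unit-variance Gaussian signal and noise distributions and equal priors; all names are ours):

```python
import math

def likelihood_ratio(e, d):
    """l_sn(e) = P(e|s) / P(e|n) for unit-variance Gaussians with means d and 0."""
    return math.exp(-0.5 * (e - d) ** 2) / math.exp(-0.5 * e ** 2)

def posterior(e, d, prior_s=0.5):
    """P(s|e) computed via Bayes rule."""
    l = likelihood_ratio(e, d)
    return l * prior_s / (l * prior_s + (1 - prior_s))

# With equal priors, P(s|e) = l/(1 + l), so a likelihood-ratio criterion beta
# corresponds to the posterior criterion beta_p = beta/(1 + beta),
# and conversely beta = beta_p/(1 - beta_p).
d = 1.0
beta = likelihood_ratio(0.8, d)   # likelihood ratio at evidence e = 0.8
beta_p = beta / (1 + beta)        # equivalent a posteriori criterion
```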
Conveniently, in a yes-no task, the slope of the ROC curve at any point is equal to the likelihood ratio criterion that generated that point.
In many psychophysical procedures, correct decisions (hits and correct rejections) are equally rewarded, and errors (false alarms and misses) are equally penalized. In this case, the optimal decision rule is to choose a criterion that maximizes the number of hits and minimizes the number of false alarms, i.e., maximizes P(S|s)-P(S|n) (where noise and signal are equally likely). The best strategy is to choose S if and only if lsn(e)>=β. Where false alarms were not measured, we assume in our analysis that observers weight hits and correct rejections equally and false alarms and misses equally. In all the studies we reviewed, error feedback did not distinguish between hits and correct rejections or between false alarms and misses.
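With equal priors and symmetric rewards and penalties, the criterion that maximizes P(S|s)-P(S|n) coincides with the likelihood-ratio criterion β = 1. A quick numerical check in Python (our own sketch, again assuming unit-variance Gaussian distributions):

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

d = 1.5  # assumed d'

# Scan criteria and maximize P(S|s) - P(S|n).
criteria = [i * 0.01 - 5 for i in range(1001)]
scores = [(1 - Phi(c - d)) - (1 - Phi(c)) for c in criteria]
best = criteria[scores.index(max(scores))]

# For equal-variance Gaussians, l_sn(e) = exp(d*e - d^2/2); at the best
# criterion (the midpoint d/2) the likelihood ratio equals 1, i.e. beta = 1.
lr_at_best = math.exp(d * best - d ** 2 / 2)
```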
The following is a MATLAB program that calculates d' from psychometric functions. Some of the routines in our simulations made use of the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997).
% example code showing how to calculate d prime from psychometric functions
% other necessary functions are FitdvPercent, dvPercent, and NormalCumulative
% (defined below)

task='2ALT'; % 'YESN' (yes-no), '2ALT' (2-alternative forced choice), or 'SMDF' (same-different)

% contrast values for the psychometric function
% (the original values were not given; these are placeholders for illustration)
contrast=[0.01 0.02 0.04 0.08 0.16 0.32];

% theoretic percent correct for each contrast, before and after training
per_correct_before=[0.5000 0.5371 0.6326 0.7500 0.8542 0.9271];
per_correct_after= [0.5000 0.6326 0.8542 0.9688 0.9964 0.9998];

init_dprime=0.5; % initial estimate of separation

% find dprime for each contrast
% IF YOUR MACHINE DOESN'T HAVE FMINS, USE FMINSEARCH - EQUIVALENT FUNCTIONS
for i=1:length(contrast)
    % dprime_before(i)=fmins('FitdvPercent', init_dprime, [], [], per_correct_before(i), task);
    % dprime_after(i) =fmins('FitdvPercent', init_dprime, [], [], per_correct_after(i), task);
    dprime_before(i)=fminsearch('FitdvPercent', init_dprime, [], per_correct_before(i), task);
    dprime_after(i) =fminsearch('FitdvPercent', init_dprime, [], per_correct_after(i), task);
end

% find the contrast for which dprime=0.5 before training
interp_contrast=interp1(dprime_before, contrast, init_dprime); % the contrast for which d'==0.5
% find the dprime value after training for the contrast at which dprime=0.5
new_dprime=interp1(contrast, dprime_after, interp_contrast);

figure;
plot(contrast, per_correct_before, 'k', contrast, per_correct_after, 'k--');
xlabel('contrast')
ylabel('percent correct')
legend('before training', 'after training')

figure;
plot(contrast, dprime_before, 'k', contrast, dprime_after, 'k--');
xlabel('contrast')
ylabel('d''')
%*****************************************%
function L=FitdvPercent(dprime, correct, task)
% finds the d prime separation for a given percent correct
% by minimizing the error between predicted and observed percent correct
bestper=dvPercent(dprime, task);
L=(bestper-correct).^2;
%*****************************************%
function bestper=dvPercent(dprime, task)
% finds the percent correct (assuming an optimal criterion) for a given d prime
% creates signal and noise distributions, assuming signal and noise
% have equal standard deviations of 1
% distributions are scaled by the standard deviation of the noise
sigS=1; % standard deviation of signal
sigN=1; % standard deviation of noise
x=linspace(-10, 10, 1000)./sigN;
% calculate the hit/false alarm rate at each possible criterion
hit=1-NormalCumulative(x, dprime, sigS^2);
fa=1-NormalCumulative(x, 0, sigN^2);
if strcmp(task, 'YESN') % yes-no
    correctvals=(hit+(1-fa))/2; % assuming signal and noise are equally likely
    beta=find(correctvals==max(correctvals)); % index of the optimal criterion
    bestper=correctvals(beta(1));
elseif strcmp(task, '2ALT') % 2-alternative forced choice
    bestper=sum((hit(1:length(x)-1)-hit(2:length(x))).*(1-fa(2:length(x))));
elseif strcmp(task, '3ALT') % 3-alternative forced choice
    bestper=sum((hit(1:length(x)-1)-hit(2:length(x))).*((1-fa(2:length(x))).^2));
elseif strcmp(task, '4ALT') % 4-alternative forced choice
    bestper=sum((hit(1:length(x)-1)-hit(2:length(x))).*((1-fa(2:length(x))).^3));
elseif strcmp(task, 'SMDF') % same-different
    correct1=hit+(1-fa); % assuming same and different are equally likely
    beta=find(correct1==max(correct1));
    bestper=2*(hit(beta(1)))^2-2*hit(beta(1))+1;
end
%*****************************************%
function prob = NormalCumulative(x,u,var)
% function prob = NormalCumulative(x,u,var)
% Compute the probability that a draw from a N(u,var)
% distribution is less than x.
% Taken from the Psychophysics Toolbox
% http://www.psychtoolbox.org/
% 6/25/96 dhb Fixed for new erf convention.
[m,n] = size(x);
z = (x - u*ones(m,n))/sqrt(var);
prob = 0.5*erfc(-z/sqrt(2));
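For readers without MATLAB, the core of dvPercent can be sketched in Python (our own translation, covering only the yes-no and 2AFC branches). Under the equal-variance model it reproduces the standard closed forms p(c) = Φ(d'/2) for yes-no and p(c) = Φ(d'/√2) for 2AFC:

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def dv_percent(dprime, task, n=4000):
    """Best percent correct for a given d' (unit-variance signal and noise)."""
    xs = [-10 + 20 * i / (n - 1) for i in range(n)]
    hit = [1 - Phi(x - dprime) for x in xs]  # P(S|s) at each criterion
    fa = [1 - Phi(x) for x in xs]            # P(S|n) at each criterion
    if task == 'YESN':
        # optimal criterion: maximize hits plus correct rejections
        return max((h + (1 - f)) / 2 for h, f in zip(hit, fa))
    elif task == '2ALT':
        # integrate the signal density times P(noise sample falls below)
        return sum((hit[i] - hit[i + 1]) * (1 - fa[i + 1]) for i in range(n - 1))
    raise ValueError('unknown task: ' + task)

d = 1.0
pc_yesno = dv_percent(d, 'YESN')  # approx Phi(d/2), about 0.69
pc_2afc = dv_percent(d, '2ALT')   # approx Phi(d/sqrt(2)), about 0.76
```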