We exploit this constraint by requiring that the posterior probability of $\omega = \mathrm{same}$ be a monotonic function of the decision variable. Let $\hat{z}(s)$ represent an estimate of the optimal decision function $z(s)$. Here, the exemplar $g(s, w)$ and quadratic $q(s, A, b, c)$ decision functions can be regarded as estimates of $z(s)$. For any given training data set consisting of $N$ samples, we compute $\hat{z}(s)$ for each sample and then sort these $\hat{z}(s)$ values into quantiles. By definition, the quantiles contain equal numbers of samples. The $j$th quantile will then contain $n_{j,\mathrm{same}}$ samples that are actually in category same and $n_{j,\mathrm{diff}}$ samples that are actually in category different. This provides an estimate $\hat{p}(\omega = \mathrm{same} \mid j)$ of the posterior probability of category same for each quantile of the decision variable:
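Assuming the estimate is simply the fraction of same samples within each quantile, it takes the form

$$
\hat{p}(\omega = \mathrm{same} \mid j) \;=\; \frac{n_{j,\mathrm{same}}}{n_{j,\mathrm{same}} + n_{j,\mathrm{diff}}}.
$$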
The quantity $\hat{p}(\omega = \mathrm{same} \mid j)$ is a noisy estimate of the true posterior probability because it is subject to the effects of sampling noise and systematic errors in the estimated decision function $\hat{z}(s)$. For example, the blue curve in Figure 5 illustrates the kind of non-monotonic relationship expected due to sampling noise. The blue curve was obtained by applying the optimal quadratic decision function to 10,000 random samples from Gaussian stimulus distributions and then binning the decision values into 200 quantiles to obtain $\hat{p}(\omega = \mathrm{same} \mid j)$. This number of samples is representative of our training data sets. The thick black curve shows the actual posterior probabilities that would be obtained with infinite sample size. The sampling noise apparent in the blue curve can make it difficult to search the parameter space of the decision function. To reduce the effects of sampling noise, we enforce the non-decreasing constraint by finding the best-fitting monotonic function $f(\omega = \mathrm{same} \mid j)$ through $\hat{p}(\omega = \mathrm{same} \mid j)$ using a monotonic regression algorithm. This is illustrated by the red curve in Figure 5, which is much closer to the black curve (the actual posterior probabilities) than to the blue curve.
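As a concrete illustration of this procedure, the sketch below bins decision values into equal-count quantiles, estimates the posterior in each quantile, and then fits a non-decreasing function by isotonic (monotonic) regression. It is a minimal sketch under assumed conditions: the labels, the one-dimensional Gaussian stand-in for the decision variable, and the helper name `quantile_posterior` are illustrative and not from the source; only the 10,000-sample, 200-quantile setup mirrors the Figure 5 description.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def quantile_posterior(z_values, labels, n_quantiles=200):
    """Estimate the posterior p(same | j) in each quantile of the decision variable.

    z_values : decision-function outputs (one per training sample)
    labels   : 1 for category 'same', 0 for category 'different'
    """
    order = np.argsort(z_values)                        # sort samples by decision value
    bins = np.array_split(labels[order], n_quantiles)   # equal-count quantile bins
    # Fraction of 'same' samples per bin: n_{j,same} / (n_{j,same} + n_{j,diff})
    return np.array([b.mean() for b in bins])

# --- Illustrative simulation (assumed stimuli, not the paper's) ---
rng = np.random.default_rng(0)
n_samples = 10_000
labels = rng.integers(0, 2, size=n_samples)             # 1 = same, 0 = different
# Hypothetical one-dimensional stand-in for z_hat(s):
# larger values are more likely under category 'same'.
z_hat = rng.normal(loc=labels.astype(float), scale=1.5)

# Noisy per-quantile posterior estimates (analogue of the blue curve in Figure 5)
p_hat = quantile_posterior(z_hat, labels, n_quantiles=200)

# Monotonic regression enforces the non-decreasing constraint
# (analogue of the red curve in Figure 5).
quantile_index = np.arange(p_hat.size)
iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True)
p_monotonic = iso.fit_transform(quantile_index, p_hat)
```

Isotonic regression via the pool-adjacent-violators algorithm is a standard choice for this kind of monotonic fit, though the specific monotonic regression algorithm used in the paper may differ.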