We investigate whether observers take into account their visual uncertainty in an optimal manner in a perceptual estimation task with explicit rewards and penalties for performance. Observers judged the mean orientation of a briefly presented texture consisting of a collection of line segments. The mean and, in some experiments, the variance of the distribution of line orientations changed from trial to trial. Subjects tried to maximize the number of points won in a “bet” on the mean texture orientation. They placed their bet by rotating a visual display that indicated two ranges of orientations: a reward region and a neighboring penalty region. Subjects won 100 points if the mean texture orientation fell within the reward region, and subjects lost points (0, 100, or 500, in separate blocks) if the mean orientation fell in the penalty region. We compared each subject's performance to a decision strategy that maximizes expected gain (MEG). For the nonzero-penalty conditions, this ideal strategy predicts subjects will adjust the payoff display to shift the center of the reward region away from the perceived mean texture orientation, putting the perceived mean orientation on the opposite side of the reward region from the penalty region. This shift is predicted to be larger for (1) larger penalties, (2) penalty regions located closer to the payoff region, and (3) larger stimulus variability. While some subjects' performance was nearly optimal, other subjects displayed a variety of suboptimal strategies when stimulus variability was high and changed unpredictably from trial to trial.

*estimation* (of depth, slant, location, orientation, etc.). In the absence of prior information or explicit consequences, subjects may attempt to maximize the percentage of correct responses, resulting in the adoption of a maximum-likelihood (ML) estimate to make optimum use of noisy sensory data (implicitly assuming a flat prior distribution). If one has prior information as to the probability of occurrence of different stimuli, one should combine that information with the sensory data (a Bayesian calculation) and use the maximum a posteriori (MAP) estimate (Kersten, Mamassian, & Yuille, 2004; Knill & Richards, 1996; Maloney, 2002; Mamassian, Landy, & Maloney, 2002). Finally, if there are known consequences (gains or losses) of different outcomes of the participant's decision or action, the optimal strategy (instead of optimizing percentage correct) is one that maximizes expected gain (MEG).

*lotteries* and asked which they would prefer. Each lottery consists of a set of mutually exclusive outcomes, their values (e.g., monetary gains or losses), and their probabilities of occurrence. As an example, subjects might be asked to choose between lottery A: “You will receive $4 with probability .8, $0 otherwise,” and lottery B: “You will receive $3 for sure.” Participants often do not choose the lottery corresponding to MEG (Bell, Raiffa, & Tversky, 1988; Kahneman, Slovic, & Tversky, 1982; Kahneman & Tversky, 2000). These failures of the MEG model are often consistent with subjects having an exaggerated aversion to losses (Kahneman & Tversky, 1979) and exaggerating small probabilities (Allais, 1953; Attneave, 1953; Lichtenstein, Slovic, Fischhoff, Layman, & Coombs, 1978; Tversky & Kahneman, 1992).

*ϕ* is the line orientation, *θ*_{l} is the circular mean orientation, *κ*_{l} is the concentration parameter, and *I* is the modified Bessel function. Note that this expression is slightly modified from the usual definition because the range of line orientation is from 0° to 180°, while the usual formulation ranges from 0° to 360°. The concentration parameter *κ*_{l} is roughly analogous to inverse variance; the distribution is flat when *κ*_{l} = 0 and becomes narrower as *κ*_{l} increases. We will find it convenient to describe the distributions in terms of orientation variability or spread *s*_{l} = 1/*κ*_{l}. Figure 1 shows stimuli with spreads of 0.002 (Figure 1A), 0.02 (Figure 1B), and 0.2 (Figure 1C), spanning the entire range of *s*_{l} values used in this study. Treating this circular variable as if it were linear, these values of spread correspond to standard deviations of 1.3°, 4.1°, and 13.7°, respectively. On each trial, the mean orientation *θ*_{l} was chosen randomly and uniformly from 0° to 180°. The manner in which *s*_{l} was chosen differed across experiments and is described later.

*θ*_{l} fell within the reward range, but not within the penalty range. Subjects indicated they were satisfied with the setting by a key press. When *θ*_{l} fell within the reward range (i.e., a line with orientation *θ*_{l} through the center of the display intersected the white arcs, as illustrated in Figure 2A), the subject was awarded 100 points. If *θ*_{l} fell within the penalty range, 0, 100, or 500 points were deducted (the penalty value was fixed within a block of trials but varied across blocks). If *θ*_{l} fell within both the reward and penalty ranges, the subject received both the reward and the penalty. Subjects were asked to try to win as many points as possible, that is, to win rewards while avoiding penalties.

*θ* used to generate the stimulus, the orientation *ψ* of the payoff display chosen by the subject, and the score for that trial. The payoff display orientation was coded as the orientation of the line joining the centers of the two white reward arcs (which passed through the center of the display).

*s*_{l} or decreased distance between the penalty and reward regions, subjects needed to rotate the payoff display to move the penalty further away from the mean stimulus orientation (Figure 2B). We recorded the shift *δ* of the setting *ψ* away from the mean orientation *θ* used to generate the texture. This shift was coded so that positive values indicate the subject set the orientation of the center of the penalty region on the opposite side of the reward region from the mean texture orientation (i.e., they “played it safe”). Thus, for the displays in Figures 1D and 1F, *δ* = *θ* − *ψ*, whereas in Figures 1E and 1G, *δ* = *ψ* − *θ*.
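This sign convention can be sketched as a small helper that treats orientation as circular with period 180° and uses a boolean flag for which mirror-image payoff display was shown (the flag is a hypothetical encoding, not from the paper):

```python
def shift(theta_deg, psi_deg, mirrored):
    """Signed shift delta (deg) of the setting psi away from the texture mean
    theta, on the 180-deg-periodic orientation circle. The mirrored flag
    selects which mirror-image payoff display was used, so that positive
    delta always means the penalty region was rotated away from the mean
    ("playing it safe")."""
    d = (theta_deg - psi_deg + 90.0) % 180.0 - 90.0  # signed difference in [-90, 90)
    return -d if mirrored else d
```

For example, `shift(179.0, 1.0, False)` returns −2.0 rather than 178.0, respecting the orientation wraparound.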

*κ̂*_{s} was calculated using the procedure of Schou (1978). Schou showed that this procedure (a marginal ML estimate) has lower bias than the ML procedure. The estimate of spread *ŝ*_{s} shown in the figures is simply 1/*κ̂*_{s}. This is *not* the ML or marginal ML estimate of *s*_{s}, but bias should be low for the large number of trials contributing to each estimate.

*s*_{l} we used an estimate of *κ̂*_{s} (and *ŝ*_{s}) that was pooled over the six conditions (three penalty levels and two types of configuration, Near and Far). Note that results were always pooled over the two mirror-symmetric payoff displays by mirroring the data, as discussed above in the definition of *δ*. Typical tests for equality of variance are based on the assumption that the underlying distributions are normal and are known to be sensitive to failures of the normality assumption (Keppel, 1982). Therefore, we devised a resampling method (Efron & Tibshirani, 1993) to test for equality of the spreads over the six conditions, based on an analogy to Hartley's *F*_{max} statistic (Keppel, 1982). We calculated the range of the six observed *κ̂*_{s} values (i.e., *F*_{max} = *κ̂*_{s,max} − *κ̂*_{s,min}). Then, we simulated the experiment 1,000 times, assuming the pooled estimate of *κ̂*_{s} to be correct, and computed *F*_{max} for each set of six *κ̂*_{s} values. The *p* value was estimated by determining the percentile of the observed *F*_{max} value in the distribution of simulated *F*_{max} values.
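A minimal sketch of this resampling test follows. Two simplifications relative to the paper are worth flagging: concentration is estimated with Fisher's standard mean-resultant-length approximation rather than Schou's marginal ML estimator, and the pooled estimate is obtained by centering each group at its circular mean and concatenating.

```python
import numpy as np

def est_kappa(angles):
    """Approximate von Mises concentration from the mean resultant length R
    (Fisher's approximation; the paper uses Schou's marginal ML estimator)."""
    R = np.abs(np.mean(np.exp(1j * angles)))
    if R < 0.53:
        return 2.0 * R + R**3 + 5.0 * R**5 / 6.0
    if R < 0.85:
        return -0.4 + 1.39 * R + 0.43 / (1.0 - R)
    return 1.0 / (R**3 - 4.0 * R**2 + 3.0 * R)

def fmax_resampling_test(groups, n_sim=1000, seed=0):
    """Resampling analogue of Hartley's F_max: the statistic is the range of
    per-group concentration estimates; its null distribution is built by
    simulating every group from a pooled concentration estimate."""
    rng = np.random.default_rng(seed)

    def centered(a):  # rotate so the group's circular mean is zero
        return a - np.angle(np.mean(np.exp(1j * a)))

    kappas = [est_kappa(g) for g in groups]
    observed = max(kappas) - min(kappas)
    pooled = est_kappa(np.concatenate([centered(g) for g in groups]))
    sims = np.empty(n_sim)
    for i in range(n_sim):
        ks = [est_kappa(rng.vonmises(0.0, pooled, len(g))) for g in groups]
        sims[i] = max(ks) - min(ks)
    # p-value: fraction of simulated ranges at least as large as observed
    return float(np.mean(sims >= observed))
```

With six equally concentrated groups the observed range falls well inside the simulated distribution; a single markedly less concentrated group drives the p value toward zero.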

*S* that was generated based on line orientation distribution parameters *θ*_{l} and *s*_{l}, and one or both of these varied from trial to trial (with distributions that varied across the three experiments). The payoff displays in Figures 1D–1G resulted in three regions with nonzero payoff: *R*_{1} (reward only), *R*_{2} (reward–penalty overlap), and *R*_{3} (penalty only). The gain associated with the reward region was *G*_{r} = 100 points, and the gain associated with the penalty region was *G*_{p} = 0, −100, or −500 points (depending on the block of trials).

*θ*_{l}; any estimate was corrupted by line segment orientation sample variability as well as by any additional imprecision due to the observer's own sensory uncertainty or noisy calculations. The best the observer could do was, on each trial, to compute an estimate *θ̂*_{l} based on the stimulus and rotate the payoff display by an appropriate amount *δ* away from that orientation *θ̂*_{l}. Similarly, the subject could not know the value of *s*_{l}. The expected gain for any particular value of *δ* was a function of *δ* and *θ*_{l} in our task. The MEG strategy was to choose the value of *δ* that maximized EG.

*θ*_{l} and *s*_{l}. However, subjects' settings were far more variable than this would predict. This was likely due to imperfect calculation of the mean orientation of the stimulus, variability in the calculation of the shift, imperfect memory of the mean stimulus orientation, and/or variability due to imperfect adjustment of the response display. As one would expect, the spread of observer shift settings was a function of the spread of line orientations in the stimulus, *s*_{l}. Thus, we were interested in an MEG model of performance based on a subject hampered by the setting variability *ŝ*_{s} we estimated from the subjects' settings. Once we had an estimate of observer setting spread *ŝ*_{s}, we had sufficient information to calculate the MEG settings. That is, the MEG model was determined entirely by the data and had no free parameters.

*s*_{l} used to generate the stimulus, and hence also knew the setting spread *s*_{s}. For this supra-ideal observer, the expected gain was again a function of the shift *δ*, and the MEG strategy was to use the value of *δ* maximizing EG.

*s*_{l} (which it could not have known), it used an estimate based on the spread of the line orientations in the stimulus shown on that trial, *ŝ*_{l}. Otherwise, the computations were identical (i.e., it used *ŝ*_{l} instead of *s*_{l} to determine the value of *s*_{s} in Equation 3). We call this model subideal because it effectively skipped the step of integrating over *s*_{l} in Equation 2, using the estimate *ŝ*_{l} instead.

*ŝ*_{s} based on that observer's data. This MEG observer model was simulated in the same experiment as the observer 1,000 times to estimate a 95% confidence interval for the range of efficiencies the MEG observer was likely to generate (an interval that, by definition, was centered on 1.0). An observer was deemed significantly suboptimal if efficiency fell below this range. Note that here and elsewhere we do *not* perform Bonferroni corrections for multiple tests in computing error bars in each figure. Thus, the error bars reflect the variability of the simulations of that data point (i.e., they act like a typical standard error). This also leads to smaller error bars than the Bonferroni correction would produce and thus makes it more likely that we will reject the null hypothesis that a subject is behaving optimally (i.e., in a sense, this is a conservative approach).
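The efficiency confidence interval can be sketched as a Monte Carlo simulation; as before, the region geometry and parameter values are hypothetical stand-ins, and small angular errors are treated as linear.

```python
import random

def simulate_run(delta, sigma, n_trials, rng, g_r=100, g_p=-500,
                 reward=(-11.0, 11.0), penalty=(-33.0, -11.0)):
    """Total score for an observer who aims with shift delta (deg) and has
    normally distributed setting error with SD sigma (deg). The reward and
    penalty intervals, relative to the reward-region center, are hypothetical
    stand-ins for the payoff display geometry."""
    total = 0
    for _ in range(n_trials):
        x = rng.gauss(delta, sigma)  # landing position of the true mean
        if reward[0] <= x <= reward[1]:
            total += g_r
        if penalty[0] <= x <= penalty[1]:
            total += g_p
    return total

def efficiency_interval(delta_meg, sigma, n_trials, n_sim=1000, seed=1):
    """95% interval of efficiency (score relative to the mean MEG score) for
    a simulated MEG observer. A real observer whose efficiency falls below
    this interval is deemed significantly suboptimal."""
    rng = random.Random(seed)
    scores = sorted(simulate_run(delta_meg, sigma, n_trials, rng)
                    for _ in range(n_sim))
    mean = sum(scores) / n_sim
    return scores[int(0.025 * n_sim)] / mean, scores[int(0.975 * n_sim)] / mean
```

By construction, the interval brackets an efficiency of 1.0, and it narrows as the number of trials per condition grows.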

*ŝ*_{s} (pooled over all penalty levels and Near/Far configurations for a given orientation spread and experiment; see the Circular statistics section) and the MEG strategy, and we determined a 95% confidence interval of the MEG shift from the resulting distribution of mean shifts. The observer's mean shift value was deemed significantly different from the MEG strategy if it fell outside that interval (a two-tailed test). We also plot error bars on the individual shifts. These were computed in the same way, using the observer's mean shift rather than the MEG shift for the simulations.

*s*_{l}) were constant within a block. As shown below, subjects were highly efficient in terms of maximizing the number of points earned, although significantly suboptimal.

*s*_{l} values of 0.002 (least variable), 0.02, and 0.2 (most variable) that also varied across blocks. Each block consisted of 20 practice trials (5 repeats of each configuration in random order), which did not count toward the subject's cumulative score, followed by 80 trials (20 repeats of each configuration) that did add to the total score. Subjects ran three *s*_{l} = 0.002 blocks, then three *s*_{l} = 0.02 blocks, and finally three *s*_{l} = 0.2 blocks. Within each variability level, subjects ran one block with penalty 0, then one with penalty 100, and finally one with penalty 500. In other words, the blocks were run approximately in order from easiest to most difficult.

*κ*_{s} of the six distributions of setting shifts *δ* corresponding to the different penalty values and the Near/Far configurations. For any given value of texture orientation variability, there was no pattern to the variation in the concentration parameters (i.e., subjects did not become more variable in more difficult conditions). The *F*_{max} statistic was significant (*p* < .05) for only 3 of the 15 such tests (5 subjects, 3 levels of *s*_{l}, no Bonferroni correction applied). Thus, we feel justified in computing a single pooled estimate of setting spread *ŝ*_{s} (the reciprocal of the pooled estimate *κ̂*_{s}) for each subject and texture orientation variability *s*_{l}. These pooled setting spread values are shown in Figure 3. Clearly, increased stimulus variability led to increased setting variability. Also note that the setting spreads are far higher than the spread of the sample circular mean orientation of the ca. 40 lines in each stimulus, since the spread of the sample mean should be substantially smaller than the spread of the distribution used to select any single line orientation (i.e., the value on the abscissa). Thus, we are justified in including observer variability in the MEG model (Equation 3).
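The claim that the sample circular mean is far less variable than individual line orientations can be checked numerically; this sketch uses the doubled-angle von Mises sampler (an assumed construction) with the most variable condition, *s*_{l} = 0.2 (*κ*_{l} = 5), and stimuli of 40 lines.

```python
import numpy as np

rng = np.random.default_rng(0)
kappa = 5.0              # line spread s_l = 0.2, the most variable condition
n_lines, n_stim = 40, 4000

# Doubled-angle von Mises samples for 4000 simulated 40-line stimuli
doubled = rng.vonmises(0.0, kappa, (n_stim, n_lines))
# Circular mean orientation of each stimulus (still in doubled-angle space)
means = np.angle(np.mean(np.exp(1j * doubled), axis=1))

def circ_sd(a):
    """Circular SD (radians) of doubled angles, halved to orientation units."""
    return np.sqrt(-2.0 * np.log(np.abs(np.mean(np.exp(1j * a))))) / 2.0

sd_single = circ_sd(doubled.ravel())  # spread of individual line orientations
sd_mean = circ_sd(means)              # spread of the 40-line sample mean
```

As expected from the central limit theorem, the sample mean is roughly √40 ≈ 6 times less variable than a single line orientation, so observed setting spreads well above `sd_mean` must reflect additional observer variability.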

*δ*) as a function of the shift predicted by the MEG model for a typical subject (all other individual subject data are shown in Supplementary Figure A1). The MEG predictions were based on the variability of each subject's settings (the pooled *ŝ*_{s} values) estimated from the data with no free parameters (Equation 3). Note that in most conditions, mean settings were not significantly different from MEG predictions (those that differed significantly are displayed with filled symbols). The horizontal and vertical lines on the plot represent the edge of the reward region (11° rotated from a zero shift, where zero represents the center of the reward region). The obviously suboptimal results (the two rightmost points) are conditions in which the MEG strategy required the subject to “aim” outside of the reward region. All subjects were reluctant to aim outside of the reward region even when it was in their best interest to do so. This particular suboptimal strategy has also been noted in a similar reaching task (Trommershäuser et al., 2005).

| Spread | Near 0 | Near 100 | Near 500 | Near avg. | Far 0 | Far 100 | Far 500 | Far avg. | Overall |
|---|---|---|---|---|---|---|---|---|---|
| 0.002 | 59.0 | 29.0 | 10.5 | 32.8 | 7.0 | 2.0 | 2.5 | 3.8 | 18.3 |
| 0.02 | 57.5 | 21.5 | 13.5 | 30.8 | 7.0 | 4.0 | 3.5 | 4.8 | 17.8 |
| 0.2 | 48.5 | 26.0 | 16.5 | 30.3 | 16.5 | 11.0 | 13.0 | 13.5 | 21.9 |
| Average | 55.0 | 25.5 | 13.5 | 31.3 | 10.2 | 5.7 | 6.3 | 7.4 | 19.4 |

*κ*_{s} values. This time there was some hint that setting spreads increased with task difficulty, but the *F*_{max} test was significant for only 5 of the 18 tests (six subjects, three texture orientation spreads, no Bonferroni correction applied). Results for a typical subject are shown in Figures 5A–5B in the same format as Figure 4 (all other individual subject data are shown in Supplementary Figure A2). The mixed-block design of Experiment 2 presented no additional difficulty for this subject. The proportion of penalties incurred in each condition is shown in Table 2 and is similar to the results of Experiment 1. Efficiency values are shown in Figures 5C–5D in the same format as Figure 4. Again, all subjects were optimal in the zero-penalty conditions. For most subjects, efficiency was also high in the nonzero-penalty conditions (Figure 5D), except for the most variable subject (DG).

| Spread | Near 0 | Near 100 | Near 500 | Near avg. | Far 0 | Far 100 | Far 500 | Far avg. | Overall |
|---|---|---|---|---|---|---|---|---|---|
| 0.002 | 54.6 | 24.2 | 10.0 | 29.6 | 9.6 | 3.8 | 3.8 | 5.7 | 17.6 |
| 0.02 | 53.8 | 25.8 | 14.6 | 31.4 | 11.2 | 5.4 | 5.8 | 7.5 | 19.4 |
| 0.2 | 52.1 | 24.2 | 17.1 | 31.1 | 22.1 | 14.2 | 6.7 | 14.3 | 22.7 |
| Average | 53.5 | 24.7 | 13.9 | 30.7 | 14.3 | 7.8 | 5.4 | 9.2 | 19.9 |

*ŝ*_{s} was approximately a linear function of line orientation spread *s*_{l}. Therefore, in this experiment the line orientation spread *s*_{l} was chosen uniformly over the range [0.002, 0.2]. This procedure led to a fairly uniform distribution of difficulty, and hence of optimal shift. The reward value was again 100 points; the penalty was 0, 100, or 500 points, varied between blocks. Each block consisted of 24 practice trials (6 repeats of each condition in random order), which did not count toward the subject's cumulative score, followed by 120 trials (30 repeats of each condition) that did add to the total score. Most subjects ran three full practice blocks with penalty 0, followed by 12 blocks with the penalty taking the values 0, 100, and 500, repeated in sequence. Subject MSL ran 9 scored blocks.

| Subject | Near 0 | Near 100 | Near 500 | Near avg. | Far 0 | Far 100 | Far 500 | Far avg. | Overall |
|---|---|---|---|---|---|---|---|---|---|
| AEW | 46.2 | 48.3 | 32.1 | 42.2 | 18.8 | 17.1 | 20.4 | 18.8 | 30.5 |
| AKK | 51.7 | 28.3 | 8.3 | 29.4 | 24.2 | 12.1 | 8.3 | 14.9 | 22.2 |
| AT | 47.5 | 26.7 | 0.8 | 25.0 | 17.1 | 13.8 | 6.2 | 12.4 | 18.7 |
| AVP | 52.1 | 52.1 | 14.2 | 39.4 | 8.8 | 12.5 | 10.4 | 10.6 | 25.0 |
| MHF | 44.2 | 36.2 | 15.4 | 31.9 | 20.4 | 13.3 | 7.9 | 13.9 | 22.9 |
| MMC | 47.1 | 36.7 | 9.6 | 31.1 | 14.6 | 9.6 | 3.8 | 9.3 | 20.2 |
| MSL | 56.1 | 30.6 | 9.4 | 32.0 | 12.2 | 8.3 | 4.4 | 8.3 | 20.2 |
| SF | 31.2 | 30.8 | 21.2 | 27.8 | 17.5 | 12.5 | 15.8 | 15.3 | 21.5 |
| SMN | 45.8 | 39.6 | 48.4 | 44.7 | 23.8 | 30.8 | 25.4 | 26.7 | 35.7 |
| ST | 46.2 | 35.8 | 19.6 | 33.9 | 21.7 | 19.6 | 19.6 | 20.3 | 27.1 |
| Average | 46.6 | 36.7 | 18.2 | 33.8 | 18.0 | 15.1 | 12.4 | 15.2 | 24.5 |

*y*-axis scales are the same for all plots for any given subject, but differ across subjects.

*y*-axis scales are the same for all plots for any given subject, but differ across subjects.