Research Article  |   January 2006
Limits to human movement planning in tasks with asymmetric gain landscapes
Journal of Vision January 2006, Vol.6, 5. doi:10.1167/6.1.5
      Shih-Wei Wu, Julia Trommershäuser, Laurence T. Maloney, Michael S. Landy; Limits to human movement planning in tasks with asymmetric gain landscapes. Journal of Vision 2006;6(1):5. doi: 10.1167/6.1.5.

      © 2015 Association for Research in Vision and Ophthalmology.

Abstract

We studied human movement planning in a task with predefined costs and benefits to movement outcome. Participants pointed rapidly at stimulus configurations consisting of a target region and up to two penalty regions. Hits on the target and penalty regions resulted in monetary gains and losses. In previous studies involving single penalty regions or other symmetric target-penalty configurations, performance was optimal in the sense of maximizing expected gain. In this study, more complex, asymmetric configurations were used in which the two penalty regions carried different penalties. With these configurations, the landscape of expected gain as a function of mean end point (MEP) was spatially asymmetric. Further, the optimal movement plan with these configurations was sometimes counterintuitive (e.g., one should aim slightly inside the lesser penalty region). In one asymmetric condition, four out of six naïve participants performed suboptimally, indicating that there are limits to human movement planning. Further, the suboptimal performance was inconsistent with a model in which participants misestimate motor variability but otherwise optimally plan their movement.

Introduction
Traditionally, visually guided movement planning is studied in experiments in which subjects perform goal-directed movements towards a visually specified target (e.g., Körding & Wolpert, 2004; Saunders & Knill, 2004; Todorov & Jordan, 2002). However, in planning a movement, the motor system goes beyond directing the arm towards a visually specified target. In selecting a movement plan, it also takes into account the consequences of possible motor errors. Trommershäuser, Maloney, & Landy (2003a, 2003b) examined movement planning in a simple task in which subjects attempted to touch a briefly displayed, green target region while avoiding one or more nearby, overlapping, red penalty regions (Figure 1A). If the subject touched within the reward region, he or she earned points and, if he or she hit within a penalty region, he or she lost points. In the configuration illustrated in Figure 1A, for example, subjects scored 10 points for hits in the reward region and lost 20 points for hits in the penalty region. If multiple regions were hit, subjects earned the sum of the points associated with those regions, and if the subject hit outside the penalty and reward regions, no points were awarded. If the subject failed to hit the screen within 700 ms, a large “time-out” penalty was imposed. Trommershäuser et al. (2003a, 2003b) varied the number of points associated with a penalty region. The subjects were informed of the rewards and penalties associated with the different regions on each trial and that the total number of points earned across trials would be redeemed for a proportional amount of money at the end of the experiment. 
Figure 1
 
A stimulus configuration and its expected gain landscape. (A) Stimulus configuration from Trommershäuser et al. (2003a). The reward and penalty associated with hitting within each region are shown. (B) The expected gain landscape for the stimulus configuration in panel A for a subject with motor variability σ = 5.75 mm. The scale on the right specifies the expected gain per trial for different MEPs. The MEG point is marked by an orange diamond.
The challenge for a subject in these experiments was to plan movements that struck an appropriate balance between the goal of hitting the target and the goal of avoiding the penalty region. The key difficulty in planning was the stochastic nature of the subject's own movements. Whatever point on the screen a subject might plan to hit, the actual point hit might differ due to unavoidable motor error. For the example in Figure 1A, the consequences of any movement plan or strategy s for the subject can be characterized by the probabilities $p_G(s)$ of hitting the green target region and $p_R(s)$ of hitting the red penalty region. The expected gain¹ of the movement plan s is then, ignoring the rare time-out penalties,
$$EG(s) = p_G(s)\,G_G + p_R(s)\,G_R, \tag{1}$$
where $G_G$ is the reward associated with the green target region and $G_R$ is the loss associated with the red penalty region. 
Trommershäuser et al. (2003a, 2003b) found that, in the context of their experiments, the probabilities of reward or penalty depended only on the mean end point (MEP) of the movement strategy s, MEP $(x_s, y_s)$, that a subject adopted. They plotted EG(s) versus $(x_s, y_s)$ to form an expected gain landscape as shown in the contour plot of Figure 1B. MEPs deep within the penalty circle are associated with low expected gains (dark regions) whereas MEPs somewhat to the right of the center of the green circle correspond to the highest expected gain possible. The MEP $(x_s, y_s)$ marked with an orange diamond corresponds to the movement strategy s* that maximizes expected gain. 
The expected gain landscape and maximum expected gain (MEG) strategy depended on the subject's own motor variability. Trommershäuser et al. (2003a, 2003b) found that end point variability was well characterized as an isotropic bivariate Gaussian distribution with a standard deviation σ that was specific to the subject but independent of MEP. Figure 1B shows the expected gain landscape for a subject with σ = 5.75 mm, a motor variability at the high end of the human range in these tasks. 
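The landscape and MEG point described above can be approximated numerically. The following Python sketch is illustrative only and is not the authors' implementation. It uses the 10-point reward and 20-point penalty quoted for Figure 1A and σ = 5.75 mm; the 9 mm radius, the placement of the penalty circle one radius to the left of the target, the function names, and the grid resolution are all assumptions made for the example.

```python
import numpy as np

def hit_probability(aim, center, radius, sigma, n=48):
    """Probability that an end point lands inside a circle, assuming isotropic
    Gaussian scatter with standard deviation sigma around the aim point."""
    # Midpoint rule on a polar grid that covers the circle.
    r = (np.arange(n) + 0.5) * radius / n
    theta = (np.arange(n) + 0.5) * 2.0 * np.pi / n
    rr, tt = np.meshgrid(r, theta)
    dx = center[0] + rr * np.cos(tt) - aim[0]
    dy = center[1] + rr * np.sin(tt) - aim[1]
    density = np.exp(-(dx**2 + dy**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    cell = (radius / n) * (2.0 * np.pi / n)
    return float(np.sum(density * rr) * cell)  # rr is the polar Jacobian

def expected_gain(aim, regions, sigma):
    """Equation 1: sum over regions of gain times hit probability."""
    return sum(gain * hit_probability(aim, center, radius, sigma)
               for center, radius, gain in regions)

def meg_point(regions, sigma, xs, ys):
    """Grid search for the mean end point that maximizes expected gain."""
    landscape = np.array([[expected_gain((x, y), regions, sigma) for x in xs]
                          for y in ys])
    iy, ix = np.unravel_index(np.argmax(landscape), landscape.shape)
    return (float(xs[ix]), float(ys[iy])), float(landscape[iy, ix])

RADIUS = 9.0  # mm (assumed)
# Hypothetical layout: target at the origin, penalty circle one radius to the left.
REGIONS = [((0.0, 0.0), RADIUS, 10.0), ((-RADIUS, 0.0), RADIUS, -20.0)]
XS = np.arange(-9.0, 18.5, 0.5)
YS = np.arange(-9.0, 9.5, 0.5)

(best_x, best_y), best_gain = meg_point(REGIONS, 5.75, XS, YS)
print(f"MEG point: ({best_x:.1f}, {best_y:.1f}) mm, expected gain {best_gain:.2f} points")
```

Because points are summed over every region that is hit, the expected gain is linear in the per-region hit probabilities, which is why the sketch never has to treat the overlap area separately.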
Trommershäuser et al. (2003a, 2003b) found that, for almost all subjects, human performance could not be discriminated from that of a MEG movement planner with the same motor variability σ as the subject. Human movement planning in the tasks they considered came close to maximizing expected gain. Subjects' winnings were typically 90% or more of the MEG. 
However, these experiments share one feature that may have helped subjects to perform so well. Across all stimulus configurations, the MEPs corresponding to optimal performance all fell on an evident axis of geometric symmetry (the dashed line in Figure 1). While it is still remarkable that subjects could determine which MEP on this axis led to MEG, it is clear that the presence of the axis of symmetry could have greatly reduced the difficulty of these tasks. 
If subjects do use the axis of symmetry in this way, their strategy can be thought of as a “motor heuristic.” The study of heuristics in human reasoning and decision making can be traced back to Simon (1957), who first argued that “a great deal of rational decision making can be learned by taking into account the limitations upon the capacities and complexity of the organism,…and the fact that the environments to which it must adapt possess properties that permit further simplification of its choice mechanisms” (Simon, 1957). Indeed, most of the research on cognitive judgment and decision making that followed focused on how humans judge and make decisions by deriving heuristics and how errors and biases could arise from using them (Kahneman, Slovic, & Tversky, 1982; Gigerenzer, Todd, & the ABC Research Group, 1999). Our conjecture is that, if subjects rely on the symmetry axis to simplify the planning problem, their performance will drop when they continue to rely on this heuristic in a context where the optimal end point no longer lies on the axis. 
The present study seeks to examine how closely human performance matches optimal performance in tasks where configurations have an evident axis of geometric symmetry but where the optimal MEP does not always lie on or near this axis of symmetry. We do not expect human performance to be exactly optimal but, following Geisler (1989), we can assess how close human subjects come to optimal and gain insight into visual and motor decision processes by comparing actual performance to a model of optimal performance. In particular, a subject who relies on geometric symmetry to constrain the search for the optimal MEP will not do well in our tasks. 
The stimulus configurations we employed consisted of up to three overlapping regions having distinct monetary gains: a reward region, a lower penalty (LP) region, and a higher penalty (HP) region. Two examples with three regions each are shown in Figures 2A and C. We refer to these configurations as 2P-A and 2P-B, respectively. These rewards and penalties are specified in points for convenience, but subjects were aware that the total of points that they won would be converted to a monetary reward at the end of the experiment. 
Figure 2
 
Asymmetric reward structures. (A) Two-penalty stimulus configuration 2P-A and its reward structure. The configuration was still geometrically symmetric but the penalties associated with the two penalty regions differ. (B) The resulting expected gain landscape and MEG point based on a motor variability σ = 5.75 mm, the same value as was used in preparing Figure 1B. Note that the MEG point is shifted away from the symmetry line, but only slightly. (C) Two-penalty stimulus configuration 2P-B and its reward structure. (D) The resulting expected gain landscape and MEG point based on the same σ used for Figures 1B and 2B. Note that the MEG point is markedly shifted away from the symmetry line and inside the blue penalty region.
Note that the configurations in Figures 2A and C are geometrically symmetric (if we ignore color), as were the stimuli of Trommershäuser et al. (2003a, 2003b), but the differences in the penalties make the corresponding expected gain landscapes in Figures 2B and D asymmetric, markedly so in Figure 2D and less so in Figure 2B.
Some combinations of configuration and gain lead to MEG points that might appear counterintuitive. For example, for configuration 2P-B, the optimal MEP for many subjects (marked with an orange diamond) lies far from the axis of symmetry, and very close to or even within one of the penalty regions. For configuration 2P-A, on the other hand, the optimal MEP lies only a short distance away from the axis of symmetry. We examined whether humans are able to achieve optimal performance even when faced with asymmetric expected gain landscapes. 
Methods
Apparatus
The experimental setup was the same as used previously (Trommershäuser et al., 2003a, 2003b). Each subject was seated in front of a transparent touchscreen (AccuTouch from Elo TouchSystems, accuracy <±2 mm standard deviation, resolution of 15,500 touch points/cm²), mounted vertically in front of a 21-in. computer monitor (Sony Multiscan CPD-G500, 1280 × 1024 pixels at 75 Hz). A chin rest was used to control the viewing distance, which was 30 cm in front of the touchscreen. The computer keyboard was mounted on the table centered in front of the monitor. The experimental room was dimly lit. The experiment was run using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) on a Pentium III Dell Precision workstation. At the beginning of every experimental session, a touch-calibration procedure was performed to ensure that the touchscreen measurements were geometrically aligned with the visual stimuli. 
Stimuli
The stimulus configuration consisted of equal-sized circular regions (9 mm radius), partially overlapping one another and carrying different payoffs. A green target region (40-point reward) was always present, while the number (from zero to two) and color of the accompanying penalty regions were randomly varied across trials. The color codes for penalty regions were consistent throughout the experiment: red always indicated the higher penalty (−50 points), whereas blue represented the smaller penalty (−20 or −10 points). The stimulus configuration always appeared within a stimulus presentation area marked by a blue rectangle on the screen and was shifted by a random x- and y-offset on each trial. Because of these random shifts, subjects could not simply execute the same preplanned movement on each trial. 
There were two experimental sessions, A and B. The possible stimulus configurations for each session are shown in Figure 3. Figure 3A shows the configurations for Session A including 2P-A and three sub-configurations consisting of the reward region only (RO), the higher penalty region together with the reward region (HP-A), and the lower penalty region together with the reward region (LP-A). The 2P-A, HP-A, and LP-A configurations could appear rotated at random by 0, 90, 180, or 270 deg. The stimuli for Session B were based on configuration 2P-B in an analogous fashion (Figure 3B). 
Figure 3
 
Stimulus configurations. (A) The stimulus configurations for Session A consisted of the two-penalty configuration (2P-A) and three sub-configurations: the high-penalty area and the reward area (HP-A), the low-penalty region and the reward region (LP-A), and the reward region only (RO). On each trial, the subject was presented with one of these four configurations rotated by 0, 90, 180, or 270 deg chosen randomly. (B) The stimulus configurations for Session B were based on the 2P-B configuration in a similar fashion.
Procedure
The order of sessions was counterbalanced across subjects. There were 13 possible configurations (3 penalty configurations × 4 orientations and the reward-only configuration). A block of 26 trials consisted of these 13 possible configurations repeated twice in random order. A session comprised 12 blocks (312 trials). A different random order of trials was used for each block. 
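The trial counts above follow from simple arithmetic, which the sketch below makes explicit by building one session's schedule. It is a minimal illustration, not the authors' experiment code: the configuration labels, function names, and seed are hypothetical, and only the counts (13 configurations, 26 trials per block, 12 blocks) come from the text.

```python
import random

PENALTY_CONFIGS = ["2P", "HP", "LP"]   # labels chosen for this sketch
ROTATIONS_DEG = [0, 90, 180, 270]

def make_block(rng):
    """One block: every configuration twice, in a fresh random order."""
    configs = [(name, rot) for name in PENALTY_CONFIGS for rot in ROTATIONS_DEG]
    configs.append(("RO", None))       # reward-only configuration, no rotation
    block = configs * 2
    rng.shuffle(block)
    return block

def make_session(seed=0, n_blocks=12):
    """A session is a list of independently shuffled blocks."""
    rng = random.Random(seed)
    return [make_block(rng) for _ in range(n_blocks)]

session = make_session(seed=1)
trials_per_session = sum(len(block) for block in session)
print(len(session[0]), trials_per_session)   # trials per block, trials per session
print(6 * 2 * trials_per_session)            # six subjects, two sessions each
```

With six subjects and two sessions each, this schedule yields the 3744 trials reported in the Results.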
The number of points corresponding to each penalty and reward region was indicated to the subject before the beginning of a session and remained constant during the session. The reward for hitting the target region was always 40 points, whereas the red penalty region always cost the subject 50 points. The penalty for the blue region was 20 points in Session A and 10 points in Session B. Subjects were told that they would receive 50 cents for every 1000 points they earned as a monetary bonus at the end of the experiment. 
Subjects started the experiment by performing a touch-calibration procedure. After calibration, an instruction describing the experimental condition appeared on the screen. Subjects started each trial by fixating a cross at the center of the screen. Next, a blue frame was shown, indicating the area within which the stimulus configuration would appear. Subjects were instructed to depress the space bar with their index finger until the stimulus configuration appeared. Precisely 500 ms after the blue frame was displayed, the stimulus configuration appeared. The amount of payoff was determined by the touch point relative to the regions in the configuration. Immediately following each trial, the subject received feedback for his or her performance on that trial. The reward and penalty regions that the subject touched would explode graphically and the subject was given a numerical summary of the net earnings on that trial. The subject then saw a summary of his or her total earnings so far in the experiment. 
In addition, if the subject failed to touch the screen within 700 ms after the stimulus presentation, he or she incurred a high time-out penalty (−300 points). If the subject responded before or within 100 ms after the configuration appeared (“a fast guess”), the results of the trial were discarded and the same configuration was presented again later in the same block. 
Subjects and instructions
Six graduate students (three male, three female) participated in the experiment. All subjects gave informed consent before the experiment and were paid $12/hr plus any performance bonus (as described above) for their participation. Participants were all right-handed and were instructed to use their right index finger to perform the experiment. All were unaware of the purpose of the experiment. 
Prior to the experiment, subjects were given a general description of the experiment. Each subject ran one practice session and the two experimental sessions (20 warm-up trials, 12 blocks of 26 trials) on three separate but consecutive days. The practice session was the same as in Trommershäuser et al. (2003b) in which only one penalty region was present in the stimulus configurations. Subjects did not face the configurations with two penalty regions until after the warm-up trials prior to the start of the first experimental session. The warm-up configurations differed from those in the experimental sessions and were adopted from Trommershäuser et al. (2003b, Experiment 2). The order of the experimental sessions was balanced across subjects. During each session, subjects were informed of the monetary payoffs before the onset of each block. Each session took approximately 35–40 min to complete. Performance bonuses ranged from $6 to $10. 
Data analysis
For each trial, we recorded the reaction time (time interval from the stimulus display to the release of the space bar) and the movement time (time interval from the release of the space bar to touching the screen). We also recorded the movement end point (the screen position that was hit) and the score (in points). Trials in which subjects released the space bar earlier than 100 ms or later than 700 ms after stimulus presentation were excluded from the analysis. 
Reaction time and movement time
Reaction time and movement time were analyzed individually for each subject as a one-factor repeated measures ANOVA. The factor was the configuration condition (2P-A, HP-A, etc.). 
Overall pointing bias
Overall pointing bias was estimated and corrected as in previous experiments (Trommershäuser et al., 2003a, 2003b). For each subject, we examined whether there was an overall pointing bias independent of conditions and removed this overall bias from end point coordinates. For each session, the bias was estimated by averaging the movement end points (relative to the center of the target) across all stimulus configurations. Any end point shifts due to observer strategy should average out given the inclusion of stimulus configurations differing by a 180 deg rotation. These estimates of overall bias ranged from 0.13 to 2.15 mm across subjects. These biases likely reflect a difference between where the touchscreen recorded each hit and what the subject considered to be the exact point of contact within the finger pad (Trommershäuser et al., 2003b). We emphasize that these small biases were only corrected in the data analysis following the experiment, not during the experiment. On each trial, the subject received the rewards or penalties associated with the point on the screen that he or she touched, without correction for his or her overall bias. 
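The bias correction described above amounts to subtracting a session-wide mean offset. The sketch below shows one way this could be computed, assuming end points and target centers are stored as (x, y) pairs in millimeters; the function name and the toy data are hypothetical.

```python
import numpy as np

def remove_pointing_bias(end_points, target_centers):
    """Return target-relative end points with the session-wide mean removed,
    together with the estimated bias vector."""
    relative = np.asarray(end_points, dtype=float) - np.asarray(target_centers, dtype=float)
    bias = relative.mean(axis=0)   # average over all configurations in the session
    return relative - bias, bias

# Toy example: a strategic shift of +1 mm and -1 mm along x (two configurations
# differing by a 180 deg rotation) plus a constant (0.5, 0.5) mm pointing bias.
end_points = [[101.5, 50.5], [199.5, 80.5]]
target_centers = [[100.0, 50.0], [200.0, 80.0]]
corrected, bias = remove_pointing_bias(end_points, target_centers)
print(bias)        # estimated bias
print(corrected)   # strategic shifts survive, the constant offset does not
```

In the toy data the opposite strategic shifts cancel in the average, so only the constant offset is removed.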
Movement end point and score
Once the end points were corrected, the score used for the analysis was adjusted based on the corrected end points. 
Motor variability
For each subject, we tested whether, for each experimental session, the variance was the same in the horizontal (x) and vertical (y) directions. We then compared the variances (pooled across the x- and y-directions) between sessions. Variances in the x- and y-directions were equal to one another within sessions for five out of six subjects (generalized likelihood ratio test, p > .05; Mood, Graybill, & Boes, 1974). Among these five, three exhibited consistent variability across the two sessions and therefore a pooled estimate of cross-session variance was computed for these subjects. For the two subjects showing different variability between sessions, we kept separate variance estimates for each session. For the subject showing different variability in the x- and y-directions, we found that the variability in each direction was consistent across sessions. For this subject, we computed estimates of variance in the x- and y-directions separately, pooled across sessions. 
MEG predictions and efficiency
Given these estimates of motor variability for each subject, we computed the MEP that led to the MEG point. Each subject's average score in each risk condition was divided by the maximal expected gain to compute efficiency. For each subject, we computed the confidence intervals of optimal efficiency separately at the 95% level and after Bonferroni correction for multiple tests using a Bootstrap method (Efron & Tibshirani, 1993) in which 100,000 runs of each condition were simulated. 
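The confidence interval for an optimal planner's efficiency can be approximated by simulation, as sketched below. This is an illustration under stated assumptions rather than the authors' procedure: the region layout and the aim point are hypothetical (the geometry of the 2P configurations is not given in the text), the expected gain at the aim point is approximated by the grand mean of the simulated scores, and far fewer runs are used than the 100,000 reported. The 96 trials per run correspond to one penalty configuration in one session (4 orientations × 2 repetitions × 12 blocks).

```python
import numpy as np

def score(points, regions):
    """Points earned per end point: gains of all regions hit are summed."""
    total = np.zeros(len(points))
    for (cx, cy), radius, gain in regions:
        inside = (points[:, 0] - cx) ** 2 + (points[:, 1] - cy) ** 2 <= radius ** 2
        total += gain * inside
    return total

def optimal_efficiency_interval(aim, regions, sigma, n_trials,
                                n_runs=5000, level=0.95, seed=0):
    """Simulate a planner that always aims at `aim` and return the central
    `level` interval of its efficiency (run mean / expected gain)."""
    rng = np.random.default_rng(seed)
    points = rng.normal(loc=aim, scale=sigma, size=(n_runs * n_trials, 2))
    gains = score(points, regions).reshape(n_runs, n_trials)
    expected = gains.mean()                  # Monte Carlo stand-in for the MEG
    efficiency = gains.mean(axis=1) / expected
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(efficiency, [tail, 1.0 - tail])
    return float(lo), float(hi)

# Hypothetical layout: target (40), red penalty (-50) and blue penalty (-10),
# with the penalty circles one radius to either side of the target.
REGIONS_HYPOTHETICAL = [((0.0, 0.0), 9.0, 40.0),
                        ((-9.0, 0.0), 9.0, -50.0),
                        ((9.0, 0.0), 9.0, -10.0)]
AIM_HYPOTHETICAL = (4.0, 0.0)  # stands in for a computed MEG point

low, high = optimal_efficiency_interval(AIM_HYPOTHETICAL, REGIONS_HYPOTHETICAL,
                                        sigma=5.75, n_trials=96)
print(f"95% interval for an optimal planner's efficiency: {low:.2f} to {high:.2f}")
```

The interval is wide relative to 100% because each estimate rests on only 96 trials, which is consistent with the remark in the Results that an optimal planner can score both below and above 100%.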
Deviation of MEPs from MEG points
For each subject and each 2P configuration, we tested whether the MEP was statistically different from its MEG point on the 2-D plane. For this purpose, we computed the Hotelling T² statistic (Manly, 2005), a standard multivariate test for the difference in two means, analogous to Student's t test. We compared computed values of T² to the appropriate cutoff after Bonferroni correction for the number of subjects and sessions run in the experiment. The overall Type I error rate was 0.05.² 
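As an illustration of the test described above, the sketch below computes a one-sample Hotelling T² statistic for 2-D end points against a fixed reference point such as a computed MEG point. It uses the textbook one-sample form and may differ in detail from the authors' analysis; the function name and toy data are hypothetical. For two dimensions the F(2, n − 2) tail probability has a closed form, so no statistics library is needed.

```python
import numpy as np

def hotelling_t2_one_sample(end_points, reference):
    """One-sample Hotelling T^2 test of 2-D end points against a reference point.
    Returns (T^2, equivalent F statistic, p-value)."""
    x = np.asarray(end_points, dtype=float)
    n, p = x.shape
    if p != 2:
        raise ValueError("the closed-form p-value below assumes 2-D data")
    diff = x.mean(axis=0) - np.asarray(reference, dtype=float)
    cov = np.cov(x, rowvar=False)              # unbiased sample covariance
    t2 = float(n * diff @ np.linalg.solve(cov, diff))
    f_stat = (n - p) / (p * (n - 1)) * t2      # ~ F(p, n - p) under the null
    # Survival function of F(2, m): (1 + 2 f / m) ** (-m / 2), with m = n - 2.
    p_value = (1.0 + 2.0 * f_stat / (n - p)) ** (-(n - p) / 2.0)
    return t2, f_stat, p_value

# Bonferroni-corrected threshold for six subjects and two sessions.
ALPHA = 0.05 / (6 * 2)

toy_points = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
t2, f_stat, p_value = hotelling_t2_one_sample(toy_points, reference=(1.0, 0.0))
print(t2, f_stat, p_value, p_value < ALPHA)
```

The corrected threshold, 0.05/12 ≈ 0.0042, matches the criterion quoted in the Results.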
Results
Prior to analyzing the results, we excluded (1) time-out trials; (2) outliers, that is, end points that were located more than three times the radius (9 × 3 = 27 mm) away from the target center; and (3) trials with reaction times less than 100 ms, indicating that subjects did not wait for the stimulus to appear before initiating movement. We excluded 32 (20 time-out trials, 12 outliers) out of a total of 3744 trials across all subjects by these criteria (less than 1%) and analyzed the remaining 3712 trials. 
In each session, the subject encountered four types of configurations, 2P, HP, LP, and reward-only. Each served a separate purpose. The one-penalty configurations served as replications of Trommershäuser et al. (2003a, 2003b). Based on the results of these trials, we could determine whether our subjects were comparable in efficiency to theirs. The two-penalty configurations allowed us to test whether subjects could plan movements to maximize expected gain with stimuli that had an asymmetric expected gain landscape. Note that all stimuli were geometrically symmetric (reward-only, HP, LP, and 2P, ignoring the color of the circles). 
Reaction time and movement time
Unsurprisingly, reaction time and movement time were significantly different across subjects, F(5,3706) = 1271.05, p < .001, and F(5,3706) = 1959.82, p < .001, for reaction time and movement time, respectively. Reaction time did not differ across stimulus configurations for any subject or session. Movement time, however, exhibited differences between configurations in both sessions. In Session A, four out of six subjects showed significant differences across the four configurations (p < .0042, Bonferroni corrected for six subjects and two sessions), whereas five subjects had different movement times across the configurations in Session B. Across sessions, three subjects (MF, LB, and KD) consistently spent less movement time (a difference of 40 ms or less) on the reward-only trials than the penalty trials (2P, HP, LP) whereas subject CH spent 10 ms more time on the reward-only trials. 
Model comparison
Two-penalty configurations
Across all subjects, the deviations of the MEP from the MEG point were larger in Session B than in Session A (Figures 4A and B). The error bars in the figures represent the 95% confidence interval of the MEP. In Session A, the Hotelling T² test revealed that the MEP of only one of the six subjects (subject CH) was significantly different from the predicted MEG point (p < .0042, Bonferroni corrected for number of subjects and number of sessions). In Session B, four of the six subjects deviated significantly from the model predictions (subjects JC, LB, CH, and KD). When we further compared the pattern of deviations between Sessions A and B, we found that in Session B, all but one subject (LB) deviated away from the predicted MEG point toward the target center and/or toward the high-penalty region (Figure 4B). In contrast, we did not observe such a pattern in Session A except for subject CH. 
Figure 4
 
MEPs and MEG predictions. (A) Stimulus configuration 2P-A. The orange diamond indicates the computed MEG point for each subject. It depends upon the subject's motor variability σ, which is shown. Subjects' MEPs are plotted with error bars indicating the 95% confidence interval. They are generally close to the computed MEG point. (B) Stimulus configuration 2P-B. The format is the same as in panel A. For four out of six of the subjects, the MEP is displaced away from the computed MEG points toward the center of the target and/or toward the high-penalty region. The pooled σ values are indicated for each subject.
Subjects' earnings in the two-penalty conditions are shown in Figure 5A as a percentage of the MEG. These estimated efficiencies are random variables and an optimal MEG movement planner with a true efficiency of 100% is expected to have estimated efficiencies both less than and greater than 100%. In Session A, efficiencies were near optimal, within 6% of optimal for five out of six subjects. In Session B, subjects typically had lower efficiencies but, despite the significant deviations from predicted MEG points, five out of six subjects gained 80% or more of their predicted MEG (exception: subject CH). Subjects' deviations were still “small” in the sense that they did not cost subjects much in earnings. Examination of the expected gain landscape (Figure 2D) confirms that small deviations away from the MEG point toward the high-penalty region are not too costly for a subject with σ = 5.75 mm. Recall that the expected gain landscape and the MEG point depend on subjects' motor variability and differ slightly from subject to subject. Overall, the results indicate that, when the expected gain landscapes were close to symmetric (Session A), subjects' MEPs were close to their predicted MEG points and their earnings were close to optimal. In Session B, however, where the expected gain landscape was markedly asymmetric, the majority of subjects chose MEPs that were markedly closer to the target center and/or red penalty region than the predicted MEG point. 
Figure 5
 
Efficiencies. (A) Two-penalty configurations. The amount of money each subject won is shown divided by the MEG possible for an optimal subject with the same motor variability. The lower contour of the darker shaded region marks the lower limit of a 95% confidence interval for the optimal (MEG) movement planner. The lower contour of the lighter shaded region marks the lower limit with a Bonferroni correction for multiple tests. (B) One-penalty configurations.
One-penalty configurations
The one-penalty configurations were always geometrically and gain symmetric and were replications of similar conditions used by Trommershäuser et al. (2003a, 2003b). There are two one-penalty configurations for each of the two-penalty configurations, one consisting of the high-penalty region and the target region (HP), the other consisting of the low-penalty region and the target circle (LP). We report the efficiencies of all observers in the HP and LP conditions in Figure 5B. With the exception of subject MF in Session B, efficiencies were close to 100% and resemble those found by Trommershäuser et al. 
The equivalent σ hypothesis
In Session B, MEPs seemed to be consistently displaced away from the MEG point toward the target center (or, alternatively, toward the high-penalty region). In the one-penalty conditions, we observed similar tendencies of subjects to hit slightly closer to the target center and the penalty region in the higher-penalty condition but not in the lower-penalty condition. To explain these outcomes, we consider a hypothesis that we refer to as the equivalent σ hypothesis. 
According to the MEG model (Trommershäuser et al., 2003a), given a known stimulus configuration, the optimal shift in mean movement end point from the target center to the MEG point is determined by the subject's own motor variability σ. However, the true motor variability σ, which determines the MEG point, may differ from the motor variability estimated by the experimenter ($\hat{\sigma}$) and also from the subject's estimate of his or her own motor variability, $\tilde{\sigma}$. The experimenter has access only to $\hat{\sigma}$; the subject has access only to $\tilde{\sigma}$. Suppose that the subject acts so as to achieve MEG in every respect, but chooses a movement plan based on $\tilde{\sigma}$. If $\hat{\sigma} \neq \tilde{\sigma}$, the experimenter will score the subject's behavior (optimal for $\tilde{\sigma}$) as suboptimal for $\hat{\sigma}$. It may be that the experimenter, the subject, or both have estimated σ incorrectly. We next consider whether the pattern of small failures across all the conditions of the experiment is consistent with the claim that the subject's motor system computes a MEG solution, but with an estimate $\tilde{\sigma}$ that differs from the experimenter's estimate $\hat{\sigma}$.
Figure 6 shows the two-penalty configurations (2P-A and 2P-B) and the one-penalty configurations (HP collapsed across Sessions A and B, and LP-B). Superimposed on each is a short contour that indicates the MEG point for values of σ ranging from 1 mm (smaller than any value measured for any of our subjects) to 7 mm (larger than any experimentally observed value of σ), together with the subjects' MEPs (closed circles). For every subject with isotropic motor variability in this range, the MEG point falls somewhere on the contours. For any value of σ, there is one MEG point on the contour in each of the four configurations. If the equivalent σ hypothesis holds for a subject, then the subject's MEPs in each condition will not be significantly different from MEG points corresponding to a single value of σ.
Figure 6
 
The effect of σ on the MEG end point. (A) Configuration 2P-A. (B) Configuration 2P-B. The orange line shows the locus of MEG points for different values of motor variability σ ranging from 1 to 7 mm (this range includes the motor variabilities of all of the subjects in the experiment reported here and also those for subjects reported by Trommershäuser et al., 2003a, 2003b). The number adjacent to each tick mark indicates its σ value. The MEP for any optimal (MEG) movement planner (with motor variability in that range) falls somewhere along this locus. (C) Configurations HP-A and HP-B. (D) Configuration LP-B.
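The dependence of the MEG point on σ can be illustrated numerically. The following Python sketch is not the authors' code: the circle radius, the region centers, and the reward value of +10 are hypothetical placeholders, and the function names are invented for illustration. It builds a gain map on a grid, blurs it with an isotropic Gaussian of standard deviation σ to obtain the expected gain for each candidate aim point, and reports the maximizing aim point for several values of σ.

```python
import numpy as np

def gain_map(xs, ys, regions):
    """Summed gain at each grid point; regions are (cx, cy, radius, gain) circles."""
    X, Y = np.meshgrid(xs, ys)
    G = np.zeros(X.shape)
    for cx, cy, r, gain in regions:
        G += gain * ((X - cx) ** 2 + (Y - cy) ** 2 <= r ** 2)
    return G

def gaussian_weights(aims, grid, sigma):
    """Row a holds normalized 1-D Gaussian weights of the grid around aims[a]."""
    d = aims[:, None] - grid[None, :]
    w = np.exp(-0.5 * (d / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def meg_point(regions, sigma, half_width=50.0, step=0.25,
              aim_half=20.0, aim_step=0.5):
    """Aim point (x, y) maximizing expected gain, and that expected gain (mm units)."""
    grid = np.arange(-half_width, half_width + step / 2, step)
    aims = np.arange(-aim_half, aim_half + aim_step / 2, aim_step)
    G = gain_map(grid, grid, regions)        # rows index y, columns index x
    W = gaussian_weights(aims, grid, sigma)  # isotropic: same weights for x and y
    EG = W @ G @ W.T                         # EG[j, i]: aim at (aims[i], aims[j])
    j, i = np.unravel_index(np.argmax(EG), EG.shape)
    return aims[i], aims[j], EG[j, i]

# Hypothetical layout (NOT the geometry of the experiment): a reward circle
# and two penalty circles placed symmetrically about the x-axis.
TARGET = (0.0, 0.0, 9.0, 10.0)      # assumed reward of +10
HIGH = (-6.4, 6.4, 9.0, -50.0)      # high-penalty circle
LOW = (-6.4, -6.4, 9.0, -10.0)      # low-penalty circle

for sigma in range(1, 8):
    x, y, eg = meg_point([TARGET, HIGH, LOW], float(sigma))
    print(f"sigma = {sigma} mm: MEG aim point = ({x:+.1f}, {y:+.1f}), expected gain = {eg:.2f}")
```

Under these assumptions, each value of σ yields exactly one maximizing aim point per configuration, which is the property the equivalent σ hypothesis relies on.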
In Figures 6A and 6D, most MEPs fall relatively close to the contour, potentially consistent with the hypothesis. In the HP conditions (Figure 6C), most MEPs lie to the left of the contour, which is consistent with the hypothesis only for an estimate of σ less than 1 mm or, equivalently, if observers treated the penalty as smaller in magnitude than −50 in this condition. However, all MEPs in the 2P-B condition (Figure 6B) lie to the left of the contour, inconsistent with the hypothesis. We can therefore reject the equivalent σ hypothesis without further test: for each subject, no single choice of σ is consistent with the results.
Utility versus gain
We considered the possibility that subjects may be maximizing expected utility (MEU) rather than expected gain, where utility is a nonlinear function of gain (Bernoulli, 1738/1954). For example, if subjects treated the −50 penalty as only three or four times greater than the −10 penalty, then their MEU points would be closer to the high-penalty (red) circle than the MEG point, as was generally found in condition 2P-B. We reject this explanation for three reasons. First, these experiments are very similar to those reported by Trommershäuser et al. (2003b) where there was no evidence that large penalties were underweighted. Second, if gains were recoded to produce the changes observed in 2P-B, we would expect the same recoding of gain to be effective in the other conditions. We have searched extensively for remappings of penalties that would simultaneously produce the patterned shifts of 2P-B but no patterned shift in 2P-A and have found none. Finally, under the conditions of this experiment (many trials for small amounts with clear accumulation of winnings), we do not expect deviations from MEG predictions given past results in the decision making literature (see Maloney, Trommershäuser, & Landy, in press). 
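The direction of the shift predicted by a utility account can be checked with a small simulation. The Python sketch below is illustrative only: the geometry (radius 9 mm, penalty center 9 mm from the target center), the reward of +10, the value σ = 5 mm, and the specific remapping of −50 to −30 (three times the −10 penalty) are all assumptions rather than values taken from the experiment. It compares the aim point that maximizes expected gain with the aim point that maximizes expected utility under the remapped penalty.

```python
import numpy as np

def best_aim(regions, sigma, step=0.25, half_width=50.0, aim_half=20.0):
    """Aim point maximizing the expected summed value of circular regions.

    regions: list of (cx, cy, radius, value); motor noise is isotropic Gaussian.
    """
    grid = np.arange(-half_width, half_width + step / 2, step)
    aims = np.arange(-aim_half, aim_half + step / 2, step)
    X, Y = np.meshgrid(grid, grid)
    value = np.zeros(X.shape)
    for cx, cy, r, v in regions:
        value += v * ((X - cx) ** 2 + (Y - cy) ** 2 <= r ** 2)
    d = aims[:, None] - grid[None, :]
    W = np.exp(-0.5 * (d / sigma) ** 2)
    W /= W.sum(axis=1, keepdims=True)   # normalized 1-D Gaussian weights
    EV = W @ value @ W.T                # EV[j, i]: aim at (aims[i], aims[j])
    j, i = np.unravel_index(np.argmax(EV), EV.shape)
    return aims[i], aims[j]

def remap(regions, table):
    """Replace nominal gains by subjective utilities (hypothetical lookup table)."""
    return [(cx, cy, r, table.get(v, v)) for cx, cy, r, v in regions]

# Hypothetical one-penalty layout; the penalty circle lies to the left.
nominal = [(0.0, 0.0, 9.0, 10.0), (-9.0, 0.0, 9.0, -50.0)]
subjective = remap(nominal, {-50.0: -30.0})   # assumed underweighting of -50

sigma = 5.0   # assumed motor variability in mm
x_gain, y_gain = best_aim(nominal, sigma)
x_util, y_util = best_aim(subjective, sigma)
print(f"MEG aim point: ({x_gain:+.2f}, {y_gain:+.2f})")
print(f"MEU aim point: ({x_util:+.2f}, {y_util:+.2f})")
```

In this sketch the MEU aim point lies closer to the penalty circle than the MEG aim point. The argument in the text is that any such remapping would have to produce the pattern of 2P-B without also producing a shift in 2P-A.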
Discussion
We previously presented a model (Trommershäuser et al., 2003a) for the planning of motor responses in environments where there are explicit gains and losses associated with the outcomes of actions. The goal of motor planning is the selection of a movement plan, an algorithm that, when executed, initiates and guides a movement. The result of executing a movement plan is a motor trajectory, but the choice of plan does not completely determine the resulting trajectory: the outcome is in part stochastic. The gain or loss incurred by any movement is determined by the actual, not the planned motor trajectory. 
In Trommershäuser et al. (2003a, 2003b), human performance in rapid pointing tasks was compared to the performance of an ideal movement planner that planned movements to maximize expected gain. In those experiments, subjects were presented with stimulus configurations consisting of two or three overlapping circular disks on a computer monitor. The reward or penalty associated with each disk was coded by color. Subjects attempted to earn money by rapidly touching within the stimulus configuration, avoiding penalty regions, and hitting reward regions. 
Trommershäuser et al. (2003a, 2003b) found that human performance in those tasks was close to optimal. However, all of the stimulus configurations that they employed had an evident axis of symmetry. The expected gain associated with each possible end point in the display formed an expected gain landscape that was symmetric about the axis of geometric symmetry of the configuration. In all cases, the optimal choice of MEP fell on that axis. 
In this study, we tested whether subjects could plan movement as well when the expected gain landscape was asymmetric. We included stimulus configurations similar to those of Trommershäuser et al. (2003a, 2003b) but added two stimulus configurations in which the two penalty regions carried different penalties. Both of these three-region configurations were geometrically symmetric. The two configurations differed with respect to the shape of the expected gain landscape. While one configuration had an expected gain landscape that was nearly symmetric, the expected gain landscape of the other was markedly asymmetric. 
The performance of the subjects with the configuration with the near-symmetric expected gain landscape was impressive. All but one of the six subjects won at least “94 cents on the optimal dollar.” In contrast, in the configuration where the expected gain landscape was markedly asymmetric, subjects did not do as well. Four out of six subjects did shift their MEP away from the target and from the red penalty in roughly the correct direction, but not far enough. The remaining two subjects (CH and JC) shifted their MEP away from the target center in the x-direction (∼2 mm) but not away from the axis of symmetry. Subject CH in particular reported that, in this condition, he “did not know what to do” and therefore he simply tried to aim for the center of the target region whenever it appeared. 
However, we note that four out of the six subjects (all but CH and JC) planned movements, even in the condition with the markedly asymmetric expected gain landscape, that were qualitatively consistent with optimal performance. In condition 2P-B, they displaced their MEPs well away from the line of symmetry (Figure 4B). The MEG point was predicted to fall inside the lesser penalty region, and for three subjects (MF, LB, and KD) the actual MEPs indeed fell inside this region. On the other hand, subjects CH and JC did not shift their MEPs well away from the symmetry axis. 
It has been argued that humans select movement plans that serve to reduce the variability of the resulting movements. People sacrifice speed to increase accuracy as targets are made smaller (Bohan, Longstaff, van Gemmert, Rand, & Stelmach, 2003; Fitts, 1954; Fitts & Peterson, 1964; Meyer, Abrams, Kornblum, Wright, & Smith, 1988; Murata & Iwase, 2001; Plamondon & Alimi, 1997; Schmidt, Zelaznik, Hawkins, Frank, & Quinn, 1979; Smyrnis, Evdokimidis, Constantinidis, & Kastrinakis, 2000). The characteristic “bell-shaped” velocity profile that the eye or finger follows in moving to a target is the profile that minimizes end point variance of the movement (Harris & Wolpert, 1998; Todorov & Jordan, 2002). Finger position variance is minimized in passing an obstacle (Hamilton & Wolpert, 2002; Sabes & Jordan, 1997), and visual information about the position and motion of the hand is combined optimally to minimize motor output variance (Saunders & Knill, 2004). Observers use previously acquired information about target position to reduce variance when the available visual feedback about the position of the fingertip becomes unreliable (Körding & Wolpert, 2004). More generally, Todorov and Jordan (2002) have proposed that the motor system selects movements that minimize task-relevant variance, and it has been demonstrated that subjects change their movement plan to adjust for changes in task-relevant variance (Trommershäuser, Gepshtein, Maloney, Landy, & Banks, 2005). These findings suggest that the motor system generates an estimate of its own motor variability. 
In previous work (Trommershäuser et al., 2003b), we considered the possibility that the subject only gradually “learns” the MEG by attending to the outcomes of each trial. If the subject is “hill climbing” toward the MEG, we would expect to see trends in end points across time. We have previously tested for such trends in our earlier experiments and found none (see, for example, Trommershäuser et al., 2003b, Figure 7). We repeated these analyses on the end point data of all subjects in the experiment reported here and also found no evidence of systematic trends. Whether a subject's end point is near optimal or less so, it does not seem to be learned gradually. 
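One simple way to look for such trends is to regress each end point coordinate on trial number and examine the slope. The source does not state which test was used, so the Python sketch below should be read as one possible choice rather than the authors' analysis. The function name is invented and the demonstration data are simulated, not taken from the experiment.

```python
import numpy as np

def linear_trend(values):
    """Least-squares slope of values against trial index, with its t statistic."""
    y = np.asarray(values, dtype=float)
    t = np.arange(y.size, dtype=float)
    tc = t - t.mean()                          # centered trial index
    sxx = np.sum(tc ** 2)
    slope = np.sum(tc * (y - y.mean())) / sxx
    resid = y - y.mean() - slope * tc
    s2 = np.sum(resid ** 2) / (y.size - 2)     # residual variance
    se = np.sqrt(s2 / sxx)                     # standard error of the slope
    return slope, slope / se

# Simulated end points (mm) for illustration only.
rng = np.random.default_rng(0)
stationary = rng.normal(loc=3.0, scale=4.0, size=300)        # no learning
drifting = stationary + np.linspace(0.0, 3.0, 300)           # gradual shift

for label, data in (("stationary", stationary), ("drifting", drifting)):
    slope, t_stat = linear_trend(data)
    print(f"{label}: slope = {slope:+.4f} mm/trial, t = {t_stat:+.2f}")
```

A subject who was hill climbing toward the MEG point would be expected to show a slope reliably different from zero in at least one coordinate.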
We asked whether the deviations from optimal movement strategies found in our experiment could be due to an erroneous estimate of the subjects' motor variability. However, the deviations in actual end points from the predicted optimal end points could not be explained by any other estimate of motor variability (Figure 6). Therefore, we suggest that deviations from optimality in configurations with asymmetric expected gain landscape may be due to the increased complexity of the movement planning task. 
If we made the stimuli sufficiently complex by increasing the number of regions, we would expect subjects' performance to become suboptimal. That was not at all the goal of this research. Our purpose was to test a specific hypothesis concerning subjects' near optimal performance as reported in earlier published work. The three-circle configurations are similar to those used in Trommershäuser et al. (2003a, 2003b). We increased the complexity of the figures (by assigning distinct penalties to two regions) as little as possible consistent with introducing an asymmetry into the expected gain landscape.
To conclude, the symmetry of the expected gain landscapes may have been an important part of the good performance reported by Trommershäuser et al. (2003a, 2003b), in the one-penalty replications of Trommershäuser et al. reported here, and in the condition with two different penalty regions having a near-symmetric expected gain landscape. Although we cannot conclude that humans rely on a symmetry-axis heuristic in our task, we also cannot reject the hypothesis that subjects used such a heuristic, given the discrepancy between performance when the optimal end point was far from the symmetry axis and the near optimal performance when the optimal end point lay on or very close to that axis. Deviations from optimality in configurations with markedly asymmetric expected gain landscapes suggest that we have found a limitation in human movement planning.
Acknowledgments
SWW, LTM, and MSL were supported by grant EY08266 from the National Institutes of Health. JT was funded by the Deutsche Forschungsgemeinschaft (Emmy-Noether Programm), grants TR 528/1-1 and TR 528/1-2.
Commercial relationships: none. 
Corresponding author: Shih-Wei Wu. 
Email: sww214@nyu.edu. 
Address: 6 Washington Place, 8th Floor, New York, NY 10003. 
Footnotes
1. The terms “loss,” “gain,” and “value” are all commonly used in applications of statistical decision theory and usage differs in different fields (Maloney, 2002). In this article, we use the term “gain” to refer to both monetary gains and losses. Losses are gains with a negative sign. We sometimes use the term “penalty” to refer to exclusively negative gains.
2. The covariance matrix in Hotelling's T² test should be the sum of the 2 × 2 covariance matrix of the mean end point (MEP, estimated from the data) and the covariance matrix of the MEG estimate. Estimating the covariance matrix of the MEG point accurately is extremely computationally intensive. However, we could verify by resampling methods using a small number of samples that the entries in the covariance matrix of MEG are less than 1% of those in the covariance matrix of MEP. We therefore omitted the covariance matrix of the MEG in Hotelling's test, in effect treating MEG as a constant rather than as an estimate. We also determined that adding values comparable in magnitude to the covariance matrix of MEG to the covariance matrix of MEP would not affect the outcome of any of the Hotelling's tests reported.
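The test described in this footnote can be sketched as a one-sample Hotelling's T² test of the end points against a fixed reference point. The Python code below is a minimal illustration, not the authors' implementation: the function name and the optional `reference_cov` argument are hypothetical, the demonstration data are simulated, and the p value uses the closed form of the F distribution with 2 numerator degrees of freedom, which applies only to two-dimensional data. When `reference_cov` is supplied, the F reference distribution is only approximate.

```python
import numpy as np

def hotelling_t2_one_sample(endpoints, reference, reference_cov=None):
    """One-sample Hotelling's T^2 for 2-D end points against a reference point.

    reference_cov (optional, hypothetical): covariance of the reference
    estimate, added to the covariance of the mean as the footnote describes.
    """
    pts = np.asarray(endpoints, dtype=float)
    n, p = pts.shape
    if p != 2:
        raise ValueError("closed-form p value below assumes 2-D end points")
    diff = pts.mean(axis=0) - np.asarray(reference, dtype=float)
    cov_of_mean = np.cov(pts, rowvar=False) / n   # covariance of the MEP
    if reference_cov is not None:
        cov_of_mean = cov_of_mean + np.asarray(reference_cov, dtype=float)
    t2 = float(diff @ np.linalg.solve(cov_of_mean, diff))
    dof2 = n - p
    f_stat = dof2 / (p * (n - 1)) * t2            # F(p, n - p) under the null
    # Survival function of F(2, dof2): (1 + 2 f / dof2) ** (-dof2 / 2)
    p_value = (1.0 + 2.0 * f_stat / dof2) ** (-dof2 / 2.0)
    return t2, f_stat, p_value

# Simulated end points (mm) for illustration only; reference is a made-up MEG point.
rng = np.random.default_rng(1)
simulated = rng.normal(loc=[4.0, -1.0], scale=4.0, size=(100, 2))
t2, f_stat, p_value = hotelling_t2_one_sample(simulated, reference=[6.0, -3.0])
print(f"T2 = {t2:.2f}, F = {f_stat:.2f}, p = {p_value:.4f}")
```

Leaving `reference_cov` unset corresponds to treating the MEG point as a constant, as was done for the tests reported here.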
References
Bernoulli, D. (1738/1954). Exposition of a new theory on the measurement of risk (L. Sommer, Trans.). Econometrica, 22, 23–36.
Bohan, M., Longstaff, M. G., van Gemmert, A. W., Rand, M. K., & Stelmach, G. E. (2003). Effects of target height and width on 2D pointing movement duration and kinematics. Motor Control, 7, 278–289.
Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. New York: Chapman-Hall.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381–391.
Fitts, P. M., & Peterson, J. R. (1964). Information capacity of discrete motor responses. Journal of Experimental Psychology, 67, 103–112.
Geisler, W. S. (1989). Sequential ideal-observer analysis of visual discriminations. Psychological Review, 96, 267–314.
Gigerenzer, G., & Todd, P. M. (1999). Simple heuristics that make us smart. New York: Oxford.
Hamilton, A. F. C., & Wolpert, D. M. (2002). Controlling the statistics of action: Obstacle avoidance. Journal of Neurophysiology, 87, 2434–2440.
Harris, C. M., & Wolpert, D. M. (1998). Signal-dependent noise determines motor planning. Nature, 394, 780–784.
Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427, 244–247.
Maloney, L. T. (2002). Statistical decision theory and biological vision. In D. Heyer & R. Mausfeld (Eds.), Perception and the physical world: Psychological and philosophical issues in perception (pp. 145–189). New York: Wiley.
Maloney, L. T., Trommershäuser, J., & Landy, M. S. (in press). Questions without words: A comparison between decision making under risk and movement planning under risk. In W. Gray (Ed.), Integrated models of cognitive systems. New York: Oxford University Press.
Manly, B. F. J. (2005). Multivariate statistical methods: A primer. Boca Raton, FL: Chapman & Hall/CRC (M3).
Meyer, D. E., Abrams, R. A., Kornblum, S., Wright, C. E., & Smith, J. E. (1988). Optimality in human motor performance: Ideal control of rapid aimed movements. Psychological Review, 95, 340–370.
Mood, A., Graybill, F. A., & Boes, D. C. (1974). Introduction to the theory of statistics (pp. …–440). New York: McGraw-Hill.
Murata, A., & Iwase, H. (2001). Extending Fitts' law to a three-dimensional pointing task. Human Movement Science, 20, 791–805.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.
Plamondon, R., & Alimi, A. M. (1997). Speed/accuracy trade-offs in target-directed movements. Behavioral and Brain Sciences, 20, 279–303.
Sabes, P. N., & Jordan, M. I. (1997). Obstacle avoidance and perturbation sensitivity in motor planning. Journal of Neuroscience, 17, 7119–7128.
Saunders, J. A., & Knill, D. C. (2004). Visual feedback control of hand movements. Journal of Neuroscience, 24, 3223–3234.
Schmidt, R. A., Zelaznik, H., Hawkins, B., Frank, J. S., & Quinn, J. T. (1979). Motor-output variability: A theory for the accuracy of rapid motor acts. Psychological Review, 86, 415–451.
Simon, H. A. (1957). Models of man. New York: Wiley.
Smyrnis, N., Evdokimidis, I., Constantinidis, T. S., & Kastrinakis, G. (2000). Speed-accuracy trade-off in the performance of pointing movements in different directions in two-dimensional space. Experimental Brain Research, 134, 21–31.
Todorov, E., & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5, 1226–1235.
Trommershäuser, J., Gepshtein, S., Maloney, L. T., Landy, M. S., & Banks, M. S. (2005). Optimal compensation for changes in task-relevant movement variability. Journal of Neuroscience, 25, 7169–7178.
Trommershäuser, J., Maloney, L. T., & Landy, M. S. (2003a). Statistical decision theory and trade-offs in the control of motor response. Spatial Vision, 16, 255–275.
Trommershäuser, J., Maloney, L. T., & Landy, M. S. (2003b). Statistical decision theory and the selection of rapid, goal-directed movements. Journal of the Optical Society of America A, 20, 1419–1433.
Figure 1
 
A stimulus configuration and its expected gain landscape. (A) Stimulus configuration from Trommershäuser et al. (2003a). The reward and penalty associated with hitting within each region are shown. (B) The expected gain landscape for the stimulus configuration in panel A for a subject with motor variability σ = 5.75 mm. The scale on the right specifies the expected gain per trial for different MEPs. The MEG point is marked by an orange diamond.
Figure 2
 
Asymmetric reward structures. (A) Two-penalty stimulus configuration 2P-A and its reward structure. The configuration was still geometrically symmetric but the penalties associated with the two penalty regions differ. (B) The resulting expected gain landscape and MEG point based on a motor variability σ = 5.75 mm, the same value as was used in preparing Figure 1B. Note that the MEG point is shifted away from the symmetry line, but only slightly. (C) Two-penalty stimulus configuration 2P-B and its reward structure. (D) The resulting expected gain landscape and MEG point based on the same σ used for Figures 1B and 2B. Note that the MEG point is markedly shifted away from the symmetry line and inside the blue penalty region.
Figure 3
 
Stimulus configurations. (A) The stimulus configurations for Session A consisted of the two-penalty configuration (2P-A) and three sub-configurations: the high-penalty area and the reward area (HP-A), the low-penalty region and the reward region (LP-A), and the reward region only (RO). On each trial, the subject was presented with one of these four configurations rotated by 0, 90, 180, or 270 deg chosen randomly. (B) The stimulus configurations for Session B were based on the 2P-B configuration in a similar fashion.
Figure 4
 
MEPs and MEG predictions. (A) Stimulus configuration 2P-A. The orange diamond indicates the computed MEG point for each subject. It depends upon the subject's motor variability σ, which is shown. Subjects' MEPs are plotted with error bars indicating the 95% confidence interval. They are generally close to the computed MEG point. (B) Stimulus configuration 2P-B. The format is the same as in panel A. For four out of six of the subjects, the MEP is displaced away from the computed MEG points toward the center of the target and/or toward the high-penalty region. The pooled σ values are indicated for each subject.
Figure 5
 
Efficiencies. (A) Two-penalty configurations. The amount of money each subject won is shown divided by the MEG possible for an optimal subject with the same motor variability. The lower contour of the darker shaded region marks the lower limit of a 95% confidence interval for the optimal (MEG) movement planner. The lower contour of the lighter shaded region marks the lower limit with a Bonferroni correction for multiple tests. (B) One-penalty configurations.
© 2006 ARVO