Abstract
Many sequential sampling models propose that decisions rely on the accumulation of evidence over time until it reaches a particular threshold. These models can often account for variations in speed and accuracy in perceptual tasks by manipulating this threshold. It has been hypothesized that the threshold is set to maximize an implicit reward function that incorporates both the speed and accuracy of the response (Gold & Shadlen, 2003). This approach has produced a family of models that can describe a variety of behaviors in two-alternative forced choice (TAFC) tasks (Bogacz et al., 2006).
We present a model of optimal sequential perceptual decision-making in a task that modifies the traditional TAFC by adding the option of acquiring an additional sample of information at a cost (e.g., time). In the task, the observer receives a sample from one of two overlapping distributions. The observer can either use an estimate based on this sample (and previous samples) to declare which distribution was sampled, or choose to receive another sample. A reward structure specifies the payoffs for correct and incorrect answers along with the cost of receiving an additional sample. Thus, the task is to weigh the current evidence against the benefit of acquiring an additional sample. The model adapts the drift-diffusion model (Ratcliff & Rouder, 1998; Palmer, Huk, & Shadlen, 2005) to sequential decisions using a partially observable Markov decision process. The model provides a framework for evaluating the actual cost structures used by humans in a perceptual judgment task and for understanding the decision maker's sensitivity to these different reward structures. The model also provides a mechanism for evaluating the effects of imperfect integration (memory limitations), variable signal strengths, and variations in the reward structure on human and optimal behavior.
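The trade-off between declaring now and paying for another sample can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation. It assumes two equal-variance Gaussian sources with means +mu and -mu, equal priors, a finite horizon, and a discretized belief state (the posterior probability that the first source generated the samples). The function name solve_sampling_task and all parameter values are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def solve_sampling_task(mu=0.5, sd=1.0, r_correct=1.0, r_error=-1.0,
                        sample_cost=0.05, horizon=50,
                        n_belief=201, n_obs=401):
    """Backward induction over the belief b = P(source is A | samples so far).

    All defaults are illustrative assumptions, not values from the abstract.
    Returns the belief grid, the value function, and a boolean mask marking
    beliefs at which taking another sample beats declaring immediately.
    """
    beliefs = np.linspace(0.0, 1.0, n_belief)
    xs = np.linspace(-mu - 6.0 * sd, mu + 6.0 * sd, n_obs)
    dx = xs[1] - xs[0]

    like_a = gaussian_pdf(xs, +mu, sd)  # p(x | source A)
    like_b = gaussian_pdf(xs, -mu, sd)  # p(x | source B)

    # Expected payoff of declaring the more probable source right away.
    stop_value = np.maximum(
        beliefs * r_correct + (1.0 - beliefs) * r_error,
        (1.0 - beliefs) * r_correct + beliefs * r_error,
    )

    # Predictive density of the next sample under each belief, and the
    # Bayesian posterior that each possible sample would produce.
    b = beliefs[:, None]
    pred = b * like_a[None, :] + (1.0 - b) * like_b[None, :]
    posterior = b * like_a[None, :] / pred

    # Quadrature weights, renormalized so each row sums to one.
    weights = pred * dx
    weights /= weights.sum(axis=1, keepdims=True)

    value = stop_value.copy()
    continue_value = np.full(n_belief, -np.inf)
    for _ in range(horizon):
        # Value after one more sample, interpolated on the belief grid.
        next_value = np.interp(posterior.ravel(), beliefs, value)
        next_value = next_value.reshape(posterior.shape)
        continue_value = -sample_cost + (weights * next_value).sum(axis=1)
        value = np.maximum(stop_value, continue_value)

    sample_more = continue_value > stop_value
    return beliefs, value, sample_more


if __name__ == "__main__":
    beliefs, value, sample_more = solve_sampling_task()
    region = beliefs[sample_more]
    if region.size:
        print("sample again while belief is in", (region.min(), region.max()))
    else:
        print("never worth sampling under this reward structure")
```

Under these assumptions, the beliefs at which sampling is preferred form a band around 0.5. Its edges play the role of the decision thresholds, and raising sample_cost narrows the band until sampling is never worthwhile.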
Project supported by NIH EY016089, AFOSR FA9550 AFOSR MURI FA09550-05-1-0321