Abstract
A challenge for the visual system is to go beyond immediately visible patterns to identify the scene processes that generated them. One sees the stripes, one infers the zebra. We asked observers to judge which of two generators had produced each of 100 binary sequences of 20 blue and yellow squares. Each sequence was equally likely to be the outcome of a random generator with probability of repetition 0.5 ("a fair coin") or of a two-state Markov generator with probability of repetition 0.9, which tended to generate long, repeating runs of yellow or blue. In addition, for sequences generated by the Markov generator, each square in each sequence could be independently disrupted (flipped from blue to yellow or vice versa). There were three experimental conditions with probabilities of disruption 0.1, 0.2, and 0.3, respectively. Each observer received extensive training with both generators and with the disruption. We first compared human performance to that of an ideal model derived from Bayesian decision theory (BDT). The ratios of observers' accuracy to that of the BDT model were 0.83, 0.97, and 0.95 in the three conditions. Human observers were markedly suboptimal, but (surprisingly) the relative advantage of the Bayesian model decreased with increasing disruption. We then compared human performance to that of several different heuristic feature models (HFMs). Each HFM based its judgment on a specific visual feature of a sequence, such as the length of the longest repeating subsequence or the number of subsequences. No HFM performed as well as the BDT model, but some feature models outperformed the median human observer. Two HFMs ("length of longest subsequence" and "total number of repetitions") matched the pattern of responses for roughly half the observers. Human performance is better captured by simple heuristic feature models than by a model based on Bayesian decision theory.
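The following Python sketch illustrates the generative processes and the two kinds of model described above. It is not the authors' code: the function and parameter names are assumptions, and the disrupted Markov generator is scored here with a standard hidden-Markov-model forward pass, one reasonable way to implement the ideal (BDT) observer.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 20                # squares per sequence
P_REP_RANDOM = 0.5    # "fair coin": repetition probability 0.5
P_REP_MARKOV = 0.9    # Markov generator: repetition probability 0.9

def generate(p_rep, p_flip=0.0, n=N):
    """Generate one binary sequence (0 = blue, 1 = yellow).

    Each square repeats the previous one with probability p_rep; each
    square is then independently flipped with probability p_flip (the
    disruption applied only to Markov-generated sequences).
    """
    seq = np.empty(n, dtype=int)
    seq[0] = rng.integers(2)
    for i in range(1, n):
        seq[i] = seq[i - 1] if rng.random() < p_rep else 1 - seq[i - 1]
    flips = rng.random(n) < p_flip
    return np.where(flips, 1 - seq, seq)

def log_lik_random(seq):
    # Under the fair-coin generator every sequence of length n has
    # probability 0.5 ** n.
    return len(seq) * np.log(0.5)

def log_lik_markov(seq, p_rep=P_REP_MARKOV, p_flip=0.1):
    # Marginal likelihood under the Markov-plus-disruption generator,
    # summing over the undisrupted states with an HMM forward pass.
    trans = np.array([[p_rep, 1 - p_rep],
                      [1 - p_rep, p_rep]])
    emit = lambda obs: np.array([1 - p_flip if obs == h else p_flip
                                 for h in (0, 1)])
    alpha = 0.5 * emit(seq[0])
    for obs in seq[1:]:
        alpha = (alpha @ trans) * emit(obs)
    return np.log(alpha.sum())

def bdt_choice(seq, p_flip):
    # Ideal (BDT) observer: the two generators are equally likely a
    # priori, so choose the one with the higher likelihood.
    return ("markov"
            if log_lik_markov(seq, p_flip=p_flip) > log_lik_random(seq)
            else "random")

# Illustrative definitions of two heuristic features named above.
def longest_run(seq):
    best = run = 1
    for a, b in zip(seq[:-1], seq[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best

def n_repetitions(seq):
    return int(np.sum(seq[:-1] == seq[1:]))
```

In this sketch, a heuristic feature model would respond "Markov" whenever a single feature such as `longest_run` or `n_repetitions` exceeds some criterion, whereas the BDT observer compares the full likelihoods of the two generators.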