Abstract
Human observers are amazingly adept at interpreting cluttered natural scenes, whether these scenes are presented as static photographs or dynamic movies. For example, observers have little difficulty in segmenting a scene into salient objects (e.g., a pedestrian walking through a park). Because the environment is itself dynamic, we asked whether there is an advantage for dynamic scenes relative to static ones. To address this question, we devised a dynamic composite stimulus in which two separate frame sequences were “blended” into a single stimulus by averaging the luminance of corresponding frames of the separate sequences. By varying the relative weight (alpha) of the two original sequences, we can make one sequence more or less visible in the composite stimulus. Here, we blended frame sequences of pedestrians walking in a park with frame sequences of various machines in action. Observers were briefly presented with two composite stimuli and judged whether a human target was present or absent in one of them. We compared the 50% alpha threshold for detection across three conditions: (1) coherent dynamic stimuli, (2) static stimuli, and (3) scrambled dynamic stimuli in which we randomized the frame order of the sequence. Overall, we found that thresholds were lower for dynamic than for static stimuli. That is, when dynamic information regarding the human target was available, observers required fewer static cues (i.e., lower alpha) to detect that target. We found no difference between coherent and scrambled dynamic composites, suggesting that the critical component is increased availability of information over time.
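The blending and scrambling manipulations described above can be illustrated with a minimal sketch. It is not the authors' stimulus code, and it rests on several assumptions: frames are represented as flat lists of luminance values, the weighted average is taken to be the convex combination alpha * target + (1 - alpha) * distractor with alpha weighting the human-target sequence, and scrambling is taken to be a random permutation of the composite's frames. The function names (`blend_frames`, `blend_sequences`, `scramble_frames`) are hypothetical.

```python
import random


def blend_frames(target, distractor, alpha):
    """Blend two same-sized frames by a weighted luminance average.

    Assumption: alpha weights the target frame and (1 - alpha) the
    distractor frame, so alpha = 0.5 is a plain average.
    """
    if len(target) != len(distractor):
        raise ValueError("frames must have the same number of pixels")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * t + (1.0 - alpha) * d for t, d in zip(target, distractor)]


def blend_sequences(target_seq, distractor_seq, alpha):
    """Blend corresponding frames of two equal-length frame sequences."""
    if len(target_seq) != len(distractor_seq):
        raise ValueError("sequences must have the same number of frames")
    return [blend_frames(t, d, alpha) for t, d in zip(target_seq, distractor_seq)]


def scramble_frames(sequence, seed=0):
    """Return a copy of the sequence with its frame order randomized.

    Assumption: the scrambled condition permutes whole frames and leaves
    each frame's content untouched.
    """
    rng = random.Random(seed)
    shuffled = list(sequence)
    rng.shuffle(shuffled)
    return shuffled


# Toy two-pixel frames: three frames per sequence.
pedestrian = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
machine = [[0.0, 1.0], [0.0, 0.0], [1.0, 0.0]]

coherent = blend_sequences(pedestrian, machine, alpha=0.25)
scrambled = scramble_frames(coherent, seed=1)
static = [coherent[0]] * len(coherent)  # one composite frame held throughout
```

Under these assumptions, lowering alpha reduces the target's contribution to every composite frame, while the scrambled sequence contains exactly the same frames as the coherent one in a different order. Holding a single composite frame for the static condition is likewise only one plausible reading of the design.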