Abstract
In visual search tasks, the time required to find a target (reaction time, RT) is a function of the number of items in the display (set size). Targets can be found efficiently if they are uniquely defined by the presence of one of a limited set of features. For example, in search for red targets among blue distractors, the slope of the RT × set size function will be close to zero. Other tasks (e.g., search for a letter among assorted distracting letters) will be inefficient even if the items can be resolved and identified without eye movements. These findings hold for the artificial tasks typically used in laboratory search experiments. What about searches in the real world, where the target is not precisely specified (“Find a bottle.”) and where the goal changes from search to search (“Find the bottle; now the fork; now the bread.”)? A major obstacle to studying such searches in real scenes has been that set size is very hard to specify (How many objects are in your field of view right now? Does the keyboard constitute one object or many?). We adopted a brute-force method, hand-labeling every object in a set of 100 indoor scenes and using the number of labeled items as a conservative estimate of set size. By this method, we placed scenes into set-size bins ranging from 20–30 to 80–90 items. Twelve observers searched for a different target on each trial, drawn at random from the set of labeled items. Targets were present on 50% of trials. Slopes of the RT × set size functions averaged 4.6 msec/item for target-present trials and 4.7 msec/item for target-absent trials. Search in these scenes appears to be guided very effectively by something other than the usual attributes such as color and orientation. We propose that scene-based properties efficiently guide attention.
NIH:NIMH56020, AFOSR, DHS.
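The central measure reported above is the slope of the RT × set size function, in msec/item. As a minimal illustrative sketch (not part of the original report), assuming made-up mean RTs per set-size bin and using bin centers as the set-size estimate, such slopes could be obtained by ordinary least squares, e.g. with numpy.polyfit:

```python
import numpy as np

# Hypothetical, illustrative data (NOT the study's measurements):
# mean RT (msec) per set-size bin, with bin centers standing in for set size.
set_size   = np.array([25, 35, 45, 55, 65, 75, 85])              # bin centers, 20-30 ... 80-90 items
rt_present = np.array([810, 855, 900, 950, 995, 1040, 1085])     # made-up target-present means
rt_absent  = np.array([900, 950, 995, 1040, 1090, 1140, 1185])   # made-up target-absent means

# Least-squares line: slope is in msec per item, intercept in msec.
slope_p, intercept_p = np.polyfit(set_size, rt_present, 1)
slope_a, intercept_a = np.polyfit(set_size, rt_absent, 1)

print(f"target-present slope: {slope_p:.1f} msec/item (intercept {intercept_p:.0f} msec)")
print(f"target-absent  slope: {slope_a:.1f} msec/item (intercept {intercept_a:.0f} msec)")
```

With these illustrative numbers the fitted slopes come out near the reported values (roughly 4.6 and 4.7 msec/item); shallow slopes of this sort are what is meant by efficient search across set sizes.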