Abstract
Most visual experience consists of dynamic natural scenes, but few tools are available for directly assessing the information that can be obtained from such experience. It would be valuable to have a measure sensitive to the richness of natural viewing, as well as viewing of movies and television, and capable of registering failures of higher-order visual function such as object recognition and face perception. We have developed a method for evaluating perception of dynamic natural scenes in which participants supply a free, natural-language description of the content of a video clip (one of 200 clips, each 30 s in duration, drawn from dramatic and documentary films); these descriptions are then scored against a normative database. One normative database was collected using Amazon.com’s Mechanical Turk (4000 responses, 99 participants), while another was collected locally (2400 responses, 60 participants) with recruitment stratified by age, including a 70y+ group. Several scoring algorithms derived from computational linguistics were evaluated on their ability to match descriptions to their corresponding clips. The best algorithm, a simple average of words shared with the normative descriptions, correctly matched 95% of the Mechanical Turk descriptions and 75% of the in-lab descriptions. The measure was further evaluated by showing Mechanical Turk participants (N = 92) clips that had been degraded by Gaussian blur, and by showing in-lab participants (N = 15) unmodified clips viewed through varying levels of optical defocus. In both conditions, the average free-description score decreased as viewing conditions were degraded, demonstrating that the measure can detect differences in information acquisition. The method could easily be adapted to evaluate specific hypotheses about sensory acquisition of high-level information.
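The best-performing scoring rule described above can be illustrated with a minimal sketch: score a description by the average fraction of its words shared with each normative description for a clip, then assign the description to the highest-scoring clip. The function names, tokenization, and normalization below are assumptions for illustration, not the authors' implementation.

```python
def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, split on whitespace."""
    return set(text.lower().split())

def score_description(description, normative_descriptions):
    """Average, over normative descriptions, of the fraction of the
    description's words that also appear in each normative description."""
    desc_words = tokenize(description)
    if not desc_words or not normative_descriptions:
        return 0.0
    overlaps = [
        len(desc_words & tokenize(norm)) / len(desc_words)
        for norm in normative_descriptions
    ]
    return sum(overlaps) / len(overlaps)

def match_clip(description, clip_norms):
    """Assign a description to the clip whose normative set scores highest.
    clip_norms maps a clip identifier to its list of normative descriptions."""
    return max(clip_norms, key=lambda clip: score_description(description, clip_norms[clip]))
```

A description is matched correctly when `match_clip` returns the identifier of the clip it was actually written about; the abstract's 95% and 75% figures correspond to this matching accuracy over the two normative databases.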
Meeting abstract presented at VSS 2013