Abstract
PURPOSE. Information in an image can be represented in separate spatial-frequency bands, or spatial scales. If non-redundant information is available at different spatial scales, access to multiple scales should yield better recognition performance than access to a single scale. An image can also be divided into different spatial regions. If there is a capacity constraint on the amount of visual information that can be processed at one time, there will be a trade-off between the resolution of the spatial scale and the size of the region analyzed. Such a capacity constraint can be simulated with a restricted “window” whose size varies with the spatial scale. We asked how effectively people use multi-scale information when they can control the spatial scale and spatial location of such a window in an object recognition task.
METHOD. Three normally sighted subjects recognized grayscale images on a computer screen. The stimulus set consisted of images of four 3D objects (wedge, pyramid, cylinder, and cone), each in eight different views. The spatial scales used were 2, 4, 8, and 16 cycles/object. The square window had a linear size of about half a cycle at the scale in use. Subjects moved a mouse to control the spatial location of the window in discrete steps and pressed one of four keys to select the spatial scale. Subjects were instructed to maximize accuracy, not speed.
RESULTS. Mean percent correct across subjects for the multi-scale condition was 54.17%. Mean percent correct for the single-scale conditions at 2, 4, 8, and 16 cycles/object was 42.71%, 55.21%, 64.58%, and 62.50%, respectively.
CONCLUSIONS. Recognition accuracy in the multi-scale condition was poorer than in three of the four single-scale conditions. Our findings indicate that human observers did not benefit from access to information at multiple spatial scales.
Supported by NIH grant EY02857