Abstract
A number of important theories of visual recognition assume that objects are represented in terms of parts and explicitly defined relations among them (e.g. Biederman, 1987). Much of the evidence concerning the encoding of relations between features (as opposed to the encoding of the features themselves) comes from so-called ‘configural effects’, such as the advantage in recognizing one part of a face when other parts are present (Tanaka & Sengco, 1997). However, many of these findings might be explained by invoking ‘larger’ features that incorporate multiple smaller features, for example a single ‘eye-and-nose’ feature rather than separate ‘eye’ and ‘nose’ features in a particular spatial relationship. The current research aims to dissociate the roles of features and their relations: subjects viewed patterns composed of multiple, distinct polygonal shapes in which the spatial relations between spatially non-contiguous features were controlled, so that specific pairs of features (‘base-pairs’) always appeared together, and in the same spatial relation to one another, across multiple patterns. Across three experiments, we tested whether subjects had learned the statistics of the patterns in terms of the joint and conditional probabilities of the positions of base-pair features. Our results showed that subjects could learn the statistical properties of non-contiguous features while discounting the properties of features located between them. These results are inconsistent with a plausible ‘larger feature’ hypothesis, which would necessarily include the spatially intermediate features, and provide direct support for the explicit encoding of relations between features in unsupervised learning.
References:
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147.
Tanaka, J. W., & Sengco, J. A. (1997). Features and their configuration in face recognition. Memory & Cognition, 25, 583-592.
Both authors were funded by NGA Award #HM1582-04-C-0051.