Abstract
Some psychological models for face recognition assume that faces are encoded as vectors in face spaces relative to an average face, or face prototype [T Valentine, Q J Exp Psychol A, 43, 161 (1991)]. It has so far remained largely unclear how such a prototype-referenced encoding could be realized at a neural level. Recent electrophysiological data support the relevance of such encoding in monkey visual cortex: neurons in area IT, after training with human faces, show monotonic tuning with respect to the caricature level of face stimuli [D Leopold et al., Soc. of Neurosci., Poster 590.7 (2003)]. A neural model is presented that accounts for these electrophysiological results. The model consists of a hierarchy of layers with physiologically plausible neural feature detectors, where the complexity of the extracted features increases along the hierarchy. Neurons on the highest level encode example views of faces. The tuning of these neurons is determined by the difference between the feature vector representing the test face and an average feature vector computed from the previous history of stimulation. The neurons are tuned monotonically with respect to the length of this difference vector and show angular tuning with respect to its direction in feature space. The model was tested with gray-level images generated with a morphable 3D face model [V Blanz, T Vetter, SIGGRAPH '99, 187–194 (1999)], replicating the stimulus set from the electrophysiological study. We conclude that prototype-referenced encoding, compared with encoding in shape spaces with absolute coordinates, increases coding efficiency by optimally exploiting the available neural hardware.
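The tuning rule described above can be sketched numerically. The following is a minimal illustrative sketch, not the authors' implementation: the function names, the running-mean prototype update, and the specific monotonic (saturating) and angular (cosine-based) tuning functions are assumptions chosen only to show how a response can depend on the length and direction of the difference vector to an adaptively estimated average face.

```python
import numpy as np

def update_prototype(prototype, stimulus, n_seen):
    """Hypothetical running-mean estimate of the average face,
    computed from the history of stimulation (assumed update rule)."""
    prototype = np.asarray(prototype, float)
    stimulus = np.asarray(stimulus, float)
    return prototype + (stimulus - prototype) / (n_seen + 1)

def face_unit_response(stimulus, prototype, preferred_dir, sigma=0.5):
    """Sketch of a top-level face neuron: response grows monotonically
    with the length of the difference vector (caricature level) and is
    angularly tuned to its direction in feature space."""
    d = np.asarray(stimulus, float) - np.asarray(prototype, float)
    length = np.linalg.norm(d)
    if length == 0.0:
        return 0.0  # the prototype itself evokes no differential response
    u = d / length
    p = np.asarray(preferred_dir, float)
    p = p / np.linalg.norm(p)
    angular = np.exp((u @ p - 1.0) / sigma**2)  # peaks when direction matches
    monotonic = 1.0 - np.exp(-length)           # saturating increase with length
    return monotonic * angular
```

In this sketch, caricaturing a face (scaling its difference vector away from the prototype along a fixed direction) monotonically increases the unit's response, while rotating the difference vector away from the preferred direction decreases it, mirroring the tuning reported in the abstract.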
Supported by the Deutsche Volkswagenstiftung, Hermann and Lilly Schilling Foundation, and the Max Planck Society.