Abstract
Facial expressions are inherently dynamic. However, most research on face processing has focused entirely on static pictures of faces. While it is clear that neurons selective for dynamic facial expressions are located in the superior temporal sulcus, the underlying neural circuits and computational mechanisms remain unknown. As a first step towards a systematic electrophysiological study of these mechanisms in monkey cortex, we devised a quantitative, physiologically plausible model for the processing of dynamic faces.
METHODS: The model combines mechanisms from a hierarchical neural model for the norm-referenced encoding of static faces (Giese & Leopold, Neurocomputing, 2005) with different mechanisms for the integration of information over time. We tested two mechanisms: 1) 'snapshot neurons' that are selective for the temporal order of the stimuli, and 2) a novel mechanism that recognizes dynamic facial expressions by detecting temporal changes in the activity of neurons that encode faces in a norm-referenced framework. These neurons encode the distance between the actual face shape and the neutral facial expression in face space.
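The second mechanism can be illustrated with a minimal Python sketch. It is not the model's implementation: the direction tuning by rectified cosine, the finite-difference derivative, and all function and parameter names (`norm_referenced_response`, `preferred_direction`, `expression_evidence`) are assumptions made only for illustration. Face shapes are treated as plain vectors in a hypothetical face space.

```python
import math

def norm_referenced_response(face, neutral, preferred_direction):
    """Toy norm-referenced neuron (illustrative assumption, not the model's equations).

    The response grows linearly with the distance of the face from the
    neutral face and is gated by how well the deviation matches the
    neuron's preferred direction in face space (rectified cosine).
    """
    diff = [f - n for f, n in zip(face, neutral)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        return 0.0  # the neutral face itself evokes no response
    pref_norm = math.sqrt(sum(p * p for p in preferred_direction))
    cosine = sum(d * p for d, p in zip(diff, preferred_direction)) / (dist * pref_norm)
    return dist * max(cosine, 0.0)

def temporal_change(responses, dt=1.0):
    """Finite-difference estimate of the temporal derivative of the activity."""
    return [(b - a) / dt for a, b in zip(responses, responses[1:])]

def expression_evidence(frames, neutral, preferred_direction, dt=1.0):
    """Accumulate rectified increases of the norm-referenced response over time."""
    responses = [norm_referenced_response(f, neutral, preferred_direction) for f in frames]
    return sum(max(c, 0.0) for c in temporal_change(responses, dt)) * dt

neutral = [0.0, 0.0]
unfolding = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]  # expression develops over time
static = [[1.0, 0.0]] * 3                          # same end pose, no movement

print(expression_evidence(unfolding, neutral, [1.0, 0.0]))
print(expression_evidence(static, neutral, [1.0, 0.0]))
```

In this toy setting the unfolding sequence produces evidence while the static end pose does not, which is the qualitative property a change detector on norm-referenced activity would need.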
RESULTS: Both models successfully recognize facial expressions of monkeys (e.g., coo call or threat) from real videos, while they make fundamentally different predictions at the single-cell level. These predictions are straightforward to test in single-cell studies.
CONCLUSIONS: The proposed mechanisms are compatible with the known electrophysiological data about the encoding of dynamic faces. They result in predictions that are straightforward to test in single-cell recordings: 1) sequence-selective neurons tuned to specific intermediate frames of facial movies, and 2) neurons whose activity changes monotonically with the distance between the stimulus and the neutral expression in face space, and with its temporal derivative. These predictions will help to structure electrophysiological experiments unraveling the cortical mechanisms of the processing of dynamic faces.
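The first prediction, sequence-selective neurons tuned to intermediate frames, can be sketched as follows. The abstract does not specify how temporal order selectivity is implemented, so the Gaussian tuning and the multiplicative gating by the preceding snapshot neuron used here are stand-in assumptions, and `rbf`, `sequence_response`, `templates`, and `sigma` are hypothetical names.

```python
import math

def rbf(frame, template, sigma=0.5):
    """Gaussian tuning of a toy snapshot neuron around its preferred key frame."""
    d2 = sum((f - t) ** 2 for f, t in zip(frame, template))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def sequence_response(frames, templates, sigma=0.5):
    """Order-sensitive readout (illustrative assumption).

    Snapshot neuron k contributes at time t only to the extent that
    snapshot neuron k-1 was active at time t-1, so the summed response
    is large when the key frames appear in the learned order.
    """
    total = 0.0
    for t in range(1, len(frames)):
        for k in range(1, len(templates)):
            previous = rbf(frames[t - 1], templates[k - 1], sigma)
            current = rbf(frames[t], templates[k], sigma)
            total += previous * current
    return total

templates = [[0.0], [1.0], [2.0]]       # learned key frames of one expression
forward = [[0.0], [1.0], [2.0]]         # movie played in the learned order
backward = list(reversed(forward))      # same frames, reversed order

print(sequence_response(forward, templates))
print(sequence_response(backward, templates))
```

Under these assumptions the forward movie evokes a much larger response than the time-reversed one, although both contain identical frames. This is the kind of order dependence that a single-cell experiment with reversed or scrambled movies could probe.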
Meeting abstract presented at VSS 2012