Abstract
Primates can recognize faces easily and rapidly despite changes in distance, illumination, head orientation, and even partial occlusion. To determine the neural and computational mechanisms of this ability, a model system devoted to the processing of faces has proven particularly useful. This network in the macaque brain consists of multiple interconnected face areas, each highly selective for faces, yet each tuned to faces in qualitatively and quantitatively different ways. Some recent computational models of face processing have succeeded in describing the main functional properties of this network: the tuning to faces (versus objects) and the tuning to head orientation. However, it remains unclear how tuning to head orientation is generated. Here we test the hypothesis that selectivity for local facial features might be the underlying cause. We presented monkey faces and non-face objects, isolated on a plain background, to AlexNet, a feedforward hierarchical convolutional network trained with stochastic gradient descent on the ImageNet database. We then characterized feature selectivity in face-selective model units by determining the area of the input image affecting their response, and we measured each unit's first-order receptive field (RF). We found highly face-selective and head-orientation-tuned units from the first convolutional layer onward. Our results suggest that head-orientation tuning is a direct consequence of local feature tuning. In early layers, eye-related features in frontal views and nose-related features in profile views were particularly abundant among the critical facial features, and coverage of eye and nose parts increased along the hierarchy. Thus, in AlexNet, local feature tuning is a sufficient mechanism for generating head-orientation tuning. We will discuss the relationship of this mechanism in a deep convolutional network to the properties found in the macaque face-processing system.
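The unit-characterization step described above might be sketched as follows. This is a minimal illustration only: the random linear filter stands in for a trained AlexNet convolutional unit, the random arrays stand in for the actual monkey-face and object images, and the face-selectivity index (F − O)/(F + O) is a commonly used measure assumed here rather than one specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one conv-layer unit: a random 11x11 linear filter.
# In the actual study, units would come from AlexNet trained on ImageNet.
filt = rng.normal(size=(11, 11))

def unit_response(image):
    """Rectified response of the unit to an 11x11 patch (ReLU-like)."""
    return max(0.0, float(np.sum(filt * image)))

# Toy stimulus sets standing in for monkey faces and non-face objects.
faces   = rng.normal(size=(50, 11, 11))
objects = rng.normal(size=(50, 11, 11))

face_resps = np.array([unit_response(im) for im in faces])
obj_resps  = np.array([unit_response(im) for im in objects])

# Face-selectivity index: +1 = responds only to faces, -1 = only to objects.
F, O = face_resps.mean(), obj_resps.mean()
fsi = (F - O) / (F + O + 1e-12)

# First-order RF estimated by response-weighted averaging over the face set
# (the spike-triggered-average analogue for model units).
rf = (face_resps[:, None, None] * faces).mean(axis=0)
```

With AlexNet units in place of the toy filter, thresholding `fsi` would identify face-selective units, and `rf` would show which region of the input image drives each unit's response.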