Abstract
Deep convolutional neural networks (DCNNs) have defined the state of the art in automatic face identification in recent years, but the nature of the information encoded in the top-level features of these networks is still poorly understood. To probe these deep feature representations, we used a face identification DCNN (Chen, Patel, & Chellappa, 2016) trained on 494,414 face images of 10,575 identities. These training images varied widely in illumination, viewpoint, and quality (blur, facial occlusion, etc.). We used this DCNN to process face images rendered from a highly controlled dataset of laser-scanned faces (Troje & Bülthoff, 1996). The images were rendered to vary systematically in viewpoint and illumination for each of 133 faces (65 male). Specifically, each face was rendered from 5 viewpoints (0° [frontal], 20°, 30°, 45°, and 60°) and under two illumination conditions (ambient vs. off-center spotlight), yielding 10 images per face. A Receiver Operating Characteristic (ROC) curve showed excellent identification performance for the DCNN on this dataset (area under the ROC curve = 0.997). Next, we used t-distributed Stochastic Neighbor Embedding (t-SNE) to compress the top-level features into two dimensions and visualize the effects of viewpoint and illumination in the DCNN similarity space. The t-SNE space was organized hierarchically, as follows. At the coarsest level, male and female faces separated into two large clusters. Within the gender clusters, images clustered by identity. Within each identity cluster, the two illumination conditions separated into sub-clusters. Remarkably, within each illumination condition there was a "chain" of systematically varying viewpoints. This hierarchical pattern indicates that although the DCNN features were optimized for identification, within-identity photometric variables were well represented in the top-level deep features.
These results illustrate how photometric information can co-exist with identity in a representation optimized only for the latter.
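As an illustration of the identification measure reported above, the following minimal sketch computes an area under the ROC curve from pairwise similarities between feature vectors. It is not the authors' pipeline: the use of cosine similarity, the comparison of every image pair, and the helper names `make_synthetic_features` and `pairwise_auc` are all assumptions, and the features are synthetic stand-ins for DCNN top-level features. Under the same all-pairs assumption, the 1,330 rendered images (133 faces × 10 images) would give 5,985 same-identity pairs and 877,800 different-identity pairs.

```python
import bisect
import itertools
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_auc(features, identities):
    """Area under the ROC curve for same- vs. different-identity pairs.

    Assumption: every image pair is scored by cosine similarity. The AUC
    is computed as the probability that a same-identity pair scores
    higher than a different-identity pair, with ties counted as one half.
    """
    same, diff = [], []
    for i, j in itertools.combinations(range(len(features)), 2):
        score = cosine(features[i], features[j])
        if identities[i] == identities[j]:
            same.append(score)
        else:
            diff.append(score)

    diff.sort()
    wins = 0.0
    for score in same:
        below = bisect.bisect_left(diff, score)      # strictly lower scores
        at_most = bisect.bisect_right(diff, score)   # lower or tied scores
        wins += below + 0.5 * (at_most - below)
    return wins / (len(same) * len(diff))


def make_synthetic_features(n_ids, images_per_id, dim, noise, seed):
    """Hypothetical stand-in for DCNN features: one random center per
    identity, plus Gaussian noise for each image of that identity."""
    rng = random.Random(seed)
    features, identities = [], []
    for identity in range(n_ids):
        center = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(images_per_id):
            features.append([c + rng.gauss(0.0, noise) for c in center])
            identities.append(identity)
    return features, identities


# 10 images per identity mirrors 5 viewpoints x 2 illumination conditions.
features, identities = make_synthetic_features(
    n_ids=20, images_per_id=10, dim=32, noise=0.3, seed=0
)
auc = pairwise_auc(features, identities)
print(f"AUC on synthetic features: {auc:.3f}")
```

A value near 1 means that same-identity pairs are almost always more similar than different-identity pairs, regardless of the within-identity variation (here, the synthetic noise) that separates the images of one face.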
Meeting abstract presented at VSS 2018