September 2018
Volume 18, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2018
Hierarchical Representations of Viewpoint and Illumination in Deep Convolutional Neural Networks Trained for Face Identification
Author Affiliations
  • Matthew Hill
    Behavioral and Brain Sciences, The University of Texas at Dallas
  • Connor Parde
    Behavioral and Brain Sciences, The University of Texas at Dallas
  • Jun-Cheng Chen
    Institute for Advanced Computer Studies, The University of Maryland
  • Carlos Castillo
    Institute for Advanced Computer Studies, The University of Maryland
  • Volker Blanz
    Institute for Vision and Graphics, University of Siegen
  • Alice O'Toole
    Behavioral and Brain Sciences, The University of Texas at Dallas
Journal of Vision September 2018, Vol.18, 353. doi:10.1167/18.10.353
Citation: Matthew Hill, Connor Parde, Jun-Cheng Chen, Carlos Castillo, Volker Blanz, Alice O'Toole; Hierarchical Representations of Viewpoint and Illumination in Deep Convolutional Neural Networks Trained for Face Identification. Journal of Vision 2018;18(10):353. doi: 10.1167/18.10.353.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Deep convolutional neural networks (DCNNs) have defined the state-of-the-art in automatic face identification in recent years, but the nature of the information encoded in the top-level features of these networks is still poorly understood. To probe these deep feature representations, we used a face identification DCNN (Chen, Patel, & Chellappa, 2016) trained with 494,414 face images of 10,575 identities. These training images varied widely in illumination, viewpoint, and quality (blur, facial occlusion, etc.). We used this DCNN to process face images rendered from a highly controlled dataset of laser-scanned faces (Troje & Bülthoff, 1996). The images were rendered to vary systematically in viewpoint and illumination for each of 133 faces (65 male). Specifically, each face was rendered from 5 viewpoints (0° [frontal], 20°, 30°, 45°, and 60°), and under two illumination conditions (ambient vs. off-center spotlight). This yielded 10 images per face. A Receiver Operating Characteristic (ROC) curve showed excellent identification performance for the DCNN on the dataset (area under the ROC = 0.997). Next, we used t-distributed Stochastic Neighbor Embedding (t-SNE) to compress the top-level feature map into two dimensions to visualize the effect of viewpoint and illumination in the DCNN similarity space. The t-SNE showed that illumination and viewpoint clustered hierarchically, as follows. The largest grouping in this t-SNE space divided males and females into two large clusters. Within the gender clusters, each image clustered according to its respective identity. Within each identity cluster, the two illumination conditions separated into sub-clusters. Remarkably, within each illumination condition there was a "chain" of systematically varying viewpoints. This hierarchical pattern indicates that, although the DCNN features were optimized for identification, within-identity photometric variables were well represented in the top-level deep features.
These results illustrate how photometric information can co-exist with identity in a representation optimized only for the latter.
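The two analysis steps described above (pairwise-similarity ROC for identification, then t-SNE compression of the top-level features) can be sketched as follows. This is a minimal illustration, not the authors' code: the feature vectors here are synthetic stand-ins for DCNN top-level features, and the cluster count, feature dimension, and noise level are hypothetical placeholders (the study used 133 identities × 10 images).

```python
# Sketch of the analysis pipeline: ROC AUC over image-pair similarities,
# then t-SNE to 2-D. Synthetic features stand in for real DCNN outputs.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical sizes (the abstract's dataset: 133 faces x 10 images each).
n_ids, imgs_per_id, dim = 20, 10, 128

# Synthetic "top-level features": each identity is a noisy cluster
# around a random center in feature space.
centers = rng.normal(size=(n_ids, dim))
feats = np.repeat(centers, imgs_per_id, axis=0) \
    + 0.3 * rng.normal(size=(n_ids * imgs_per_id, dim))
labels = np.repeat(np.arange(n_ids), imgs_per_id)

# Identification performance: cosine similarity for every image pair,
# scored as same-identity vs. different-identity with ROC AUC.
normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
sim = normed @ normed.T
iu = np.triu_indices(len(labels), k=1)  # upper triangle = unique pairs
auc = roc_auc_score((labels[iu[0]] == labels[iu[1]]).astype(int), sim[iu])

# Compress the feature map to two dimensions with t-SNE; plotting the
# rows of `emb` colored by identity/illumination reveals the clustering.
emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(feats)

print(f"AUC = {auc:.3f}")
print(emb.shape)
```

With well-separated synthetic clusters, the AUC lands near the ceiling value reported in the abstract; in the real analysis the same AUC and embedding are computed from the DCNN's top-level features rather than simulated vectors.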

Meeting abstract presented at VSS 2018
