September 2017
Volume 17, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   August 2017
Face Representations in Deep Convolutional Neural Networks
Author Affiliations
  • Connor Parde
    School of Behavioral and Brain Sciences, The University of Texas at Dallas
  • Carlos Castillo
    Department of Electrical Engineering, University of Maryland, College Park
  • Matthew Hill
    School of Behavioral and Brain Sciences, The University of Texas at Dallas
  • Y. Colon
    School of Behavioral and Brain Sciences, The University of Texas at Dallas
  • Jun-Cheng Chen
    Department of Electrical Engineering, University of Maryland, College Park
  • Swami Sankaranarayanan
    Department of Electrical Engineering, University of Maryland, College Park
  • Alice O'Toole
    School of Behavioral and Brain Sciences, The University of Texas at Dallas
Journal of Vision August 2017, Vol.17, 246. doi:10.1167/17.10.246
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Algorithms based on deep convolutional neural networks (DCNNs) have made impressive gains on the problem of recognizing faces across changes in appearance, illumination, and viewpoint. These networks are trained on very large numbers of face identities and ultimately develop a highly compact representation of each face at the network's top level. It is generally assumed that these representations capture aspects of facial identity that are invariant across pose, illumination, expression, and appearance. We analyzed the top-level feature space produced by two state-of-the-art DCNNs trained for face identification with more than 494,000 images of 10,575 individuals (Chen, 2016; Sankaranarayanan, 2016). In one set of experiments, we trained classifiers to predict image-based properties of faces using the networks' top-level feature descriptions as input. Classifiers determined face yaw to within 9.5 degrees and face pitch (frontal versus offset) at 67% correct. Top-level features also predicted whether the input came from a photograph or a video frame with 87% accuracy. In a second experiment, we compared the top-level feature codes of different views of the same identities to develop an index of feature invariance. Surprisingly, we found that invariant coding was a characteristic of individual identities rather than of individual features: some identities were encoded invariantly, whereas others were not. In a third analysis, we used t-distributed Stochastic Neighbor Embedding (t-SNE) to visualize the top-level DCNN feature space for the Janus CS3 dataset (cf. Klare et al., 2015), which contains over 69,000 images of 1,894 distinct identities. This visualization indicated that image-quality information is retained in the top-level DCNN features, with poor-quality images clustering at the center of the space.
The representation of photometric details for face images in top-level DCNN features echoes findings of object category-orthogonal information in macaque IT cortex (Hong et al., 2016), reinforcing the claim that coarse codes can effectively represent complex stimulus sets.
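The per-identity invariance comparison described in the second experiment can be illustrated with a small sketch. This is not the authors' code: it assumes an invariance index defined as the mean pairwise cosine similarity among one identity's top-level descriptors across views, and it uses synthetic 128-dimensional feature vectors in place of real DCNN outputs.

```python
import numpy as np

def invariance_index(features):
    """Mean pairwise cosine similarity among an identity's view descriptors.

    features: array of shape (n_views, d), one top-level descriptor per view.
    A value near 1 means the code barely changes across views (invariant);
    lower values mean view-dependent coding. (Hypothetical index definition.)
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f.T                      # all pairwise cosine similarities
    iu = np.triu_indices(len(f), k=1)   # upper triangle: distinct pairs only
    return sims[iu].mean()

# Synthetic stand-ins for two identities' top-level codes across 5 views.
rng = np.random.default_rng(0)
base = rng.normal(size=128)
stable = base + 0.05 * rng.normal(size=(5, 128))    # views vary little
variable = base + 1.0 * rng.normal(size=(5, 128))   # views vary a lot

print(invariance_index(stable) > invariance_index(variable))  # True
```

Ranking identities by such an index is one simple way to make the abstract's observation concrete: invariance becomes a property measured per identity, not per feature dimension.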

Meeting abstract presented at VSS 2017
