Vision Sciences Society Annual Meeting Abstract  |   September 2021
Biologically plausible illumination-invariant face representations
Author Affiliations & Notes
  • Dylan D. Doblar
    Massachusetts Institute of Technology
  • Katherine M. Collins
    Massachusetts Institute of Technology
  • Bernhard Egger
    Massachusetts Institute of Technology
  • Tomaso Poggio
    Massachusetts Institute of Technology
  • Footnotes
    Acknowledgements  We thank Qianli Liao for his advice on how to implement the templates-and-signatures model. This work was funded in part by the Center for Brains, Minds and Machines (CBMM), NSF STC award CCF-1231216. B. Egger is supported by a PostDoc Mobility Grant, Swiss National Science Foundation P400P2 191110.
Journal of Vision September 2021, Vol.21, 2232. https://doi.org/10.1167/jov.21.9.2232
Citation: Dylan D. Doblar, Katherine M. Collins, Bernhard Egger, Tomaso Poggio; Biologically plausible illumination-invariant face representations. Journal of Vision 2021;21(9):2232. https://doi.org/10.1167/jov.21.9.2232.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Humans possess a remarkable ability to identify objects — and faces in particular — under highly variable conditions. The invariance hypothesis (Leibo et al., 2015) posits that the goal of the human ventral visual system is to learn representations that are invariant to identity-preserving transformations. One computational approach to learning such representations is the templates-and-signatures model proposed by Liao et al. (2013). The model measures similarities between template images and unseen query images to produce discriminative and invariant signatures. From a small number of template examples, the model can learn invariance to group transformations (e.g., scaling, translation, and in-plane rotation) for new object classes, whereas object-class-specific templates are required for non-group transformations (e.g., pose). Here, we probe the capacity of this approach to handle variation in illumination — a complex set of non-group transformations — on a face verification task. We use a 3D morphable face model to generate synthetic datasets under both natural and extreme illumination. We benchmark the templates-and-signatures approach against VGG16, a convolutional neural network (CNN), and compare the effects of a generic object-class prior versus a domain-specific prior by pre-training VGG16 on either ImageNet or human faces. We find that under natural illumination settings, the templates-and-signatures model effectively solves the face verification task, outperforming both CNN variants. Additionally, the templates-and-signatures model’s learned representations are impressively invariant to extreme variations in illumination and generalize best when using natural illumination templates. These invariances hold even with only tens of training examples, which is particularly striking relative to the CNNs pre-trained on millions of images. Coupled with its simplicity and its implications for a biologically plausible sequence of class-specific developmental periods for learning invariances, the model’s ability to generalize to out-of-distribution illumination settings from few examples lends credence to a templates-and-signatures account of feed-forward object recognition.
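
To make the mechanism concrete, the sketch below shows one plausible reading of the templates-and-signatures computation in NumPy: a query image is compared, via normalized dot products, against each stored template's "orbit" (here, the same template face rendered under many illumination conditions), and pooling over the orbit yields one transformation-invariant signature component per template. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the max-pooling choice (the published model pools richer statistics of the dot products, e.g., histogram bins or moments), and the verification threshold are illustrative.

    import numpy as np

    def signature(image, template_orbits, pool=np.max):
        # image: 2-D face image; template_orbits: list of arrays, each of
        # shape (n_transforms, n_pixels), holding one template under every
        # stored transformation of interest (for this task, illuminations).
        x = image.ravel().astype(float)
        x = x / (np.linalg.norm(x) + 1e-12)
        components = []
        for orbit in template_orbits:
            T = orbit / (np.linalg.norm(orbit, axis=1, keepdims=True) + 1e-12)
            components.append(pool(T @ x))  # pool dot products over the orbit
        return np.array(components)         # one invariant value per template

    def same_identity(img_a, img_b, template_orbits, threshold=0.9):
        # Face verification: declare a match if the two signatures are close.
        a = signature(img_a, template_orbits)
        b = signature(img_b, template_orbits)
        cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        return cos >= threshold

    # Hypothetical usage: build each orbit by rendering one template face
    # under many lights (render() and the variables here are placeholders).
    # orbits = [np.stack([render(face, light).ravel() for light in lights])
    #           for face in template_faces]

Because invariance comes from the stored orbits rather than from gradient-based training, handling a new transformation only requires observing templates under that transformation, which is consistent with the abstract's finding that tens of examples suffice where the CNNs rely on millions of pre-training images.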
