Recent advances in machine learning might provide a means to constrain image manipulations to the domain of natural images. Here, we focus on a class of very powerful generative image models, known as generative adversarial nets (GANs; Goodfellow et al.,
2014; Radford, Metz, & Chintala,
2016; Arjovsky, Chintala, & Bottou,
2017; Gulrajani, Ahmed, Arjovsky, Dumoulin, & Courville,
2017; Miyato, Kataoka, Koyama, & Yoshida,
2018; Hjelm et al.,
2018). GANs learn a mapping, called the
generator, from an isotropic Gaussian distribution to the space of images. One defining feature of GANs is the use of an auxiliary classification function, often called the
critic, to judge how good the generator mapping is. Specifically, the critic attempts to predict whether a given image has been generated by mapping isotropic Gaussian noise through the generator or whether it is an instance from the training database. Generator and critic are trained in alternation: the generator is trained to increase the critic's error, and the critic is trained to decrease its own error (for example, see Goodfellow et al.,
2014, for details). In general, generator and critic can be any possible transformation, but they are typically implemented as artificial neural networks with multiple hidden layers (Goodfellow et al., 2014; Radford et al., 2016).
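To make the alternating training procedure concrete, the following sketch shows one generator/critic update step in PyTorch. All architectural choices, dimensions, and hyperparameters are illustrative assumptions rather than details taken from the works cited above; the generator step uses the common non-saturating formulation, in which the generator is trained to make the critic label its samples as real.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the cited papers): 64-d latent space, 28x28 images.
latent_dim, img_dim = 64, 28 * 28

# Generator: maps isotropic Gaussian noise z to an image.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, img_dim), nn.Tanh(),
)

# Critic: outputs the probability that an image came from the training set rather than the generator.
critic = nn.Sequential(
    nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=2e-4)
bce = nn.BCELoss()

def training_step(real_images):
    """One round of the alternating scheme: update the critic, then the generator."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Critic step: decrease its own classification error on real vs. generated images.
    z = torch.randn(batch, latent_dim)              # isotropic Gaussian noise
    fake_images = generator(z).detach()             # no generator gradients in this step
    loss_c = bce(critic(real_images), real_labels) + bce(critic(fake_images), fake_labels)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # Generator step: increase the critic's error by making generated images look "real" to it.
    z = torch.randn(batch, latent_dim)
    loss_g = bce(critic(generator(z)), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_c.item(), loss_g.item()

# Usage with a random stand-in batch; a real run would loop over batches of training images.
critic_loss, generator_loss = training_step(torch.randn(32, img_dim))
```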
Although this has never been studied quantitatively, images generated by GANs look quite similar to natural images, and manipulations in a GAN's latent space seem to correspond in a meaningful way to perceptual experience. For example, Radford et al. (2016) start with a picture of a smiling woman, subtract the average latent representation of a neutral woman's face, and add that of a neutral man's face to arrive at a picture of a smiling man. Similarly, Zhu, Krähenbühl, Shechtman, and Efros (2016) illustrate that projecting perceptually meaningful constraints back into a GAN's latent space allows the creation of random images with specified features (e.g., edges or colored patches) at the specified locations. Together, these observations suggest that GANs recover a reasonably good approximation to the manifold of natural images.
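As an illustration of the latent-space arithmetic described by Radford et al. (2016), the sketch below applies the "smiling woman - neutral woman + neutral man" operation to latent codes. The category codes and the untrained generator here are placeholders; in the original work, the codes are averages of latent vectors whose generated images fall into the respective categories, and the generator is a trained network.

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28
# Placeholder generator; in practice this would be a trained generator as in the sketch above.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, img_dim), nn.Tanh(),
)

# Placeholder category codes: each stands in for the average latent vector of several
# exemplars ("smiling woman", "neutral woman", "neutral man") identified by inspection.
z_smiling_woman = torch.randn(10, latent_dim).mean(dim=0)
z_neutral_woman = torch.randn(10, latent_dim).mean(dim=0)
z_neutral_man = torch.randn(10, latent_dim).mean(dim=0)

# Vector arithmetic in latent space: smiling woman - neutral woman + neutral man.
z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man

# Mapping the resulting code through the generator ideally yields a smiling man's face.
smiling_man_image = generator(z_smiling_man.unsqueeze(0))
```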