Abstract
A human infant can distinguish faces from non-faces when just a few weeks old, and possibly even sooner. This raises the question of how the ability arises. A prominent proposal that attempts to account for its very early onset is that an innately specified rudimentary face ‘template’ predisposes infants to look at faces. The counterpoint to this proposal, a purely experiential account of early face preferences, has not been adequately tested. Is it possible that an infant's early visual input is sufficient to permit rapid acquisition of a basic face concept? We address this question by testing the computational feasibility of learning a face concept from inputs that correspond to a newborn's visual experience.
We began this research when one of us (PS) recently became a father. Using a headband-mounted miniature camera (with optics modified to approximately mimic the infant's reduced acuity), we recorded video clips from the baby's point of view. The clips revealed several notable aspects of the baby's visual experience of faces. Because the newborn is held by the parents for much of its waking time, faces are the most prevalent objects in the input. Faces are also the most salient, by virtue of their ability to move and produce sound, and they have few competitors, since the baby spends most of the first few weeks indoors, in an environment where people are the primary animate entities. Furthermore, the infant's poor acuity significantly degrades images of objects beyond a few feet, adding to the salience of close-up faces. We find that these aspects of the baby's experience make it feasible for an unsupervised learning system to rapidly acquire a rudimentary face concept, even without any innately specified face templates.
Supported by the John Merck Fund and the Simons Foundation.