Because conventional security systems provide poor image quality, the identification of a person captured on video must rely largely on information carried by low spatial frequencies. However, the matching stimuli for such video images are often high-quality photographs, in which facial information is carried by both low and high spatial frequencies. We examined whether a large discrepancy in quality between the video image and the photographic image affects identification performance, given prior evidence that face recognition is strongly influenced by spatial filtering and by the amount of spatial-frequency overlap between learned and test images. The quality of the video clips and the photographic images was either discrepant or equalized. In the equal-quality condition, photographs were reduced to the quality level of the video images. A standard recognition task and a simultaneous matching task were used. Each person on the video was presented either as a short movie or as four static snapshots taken from the same clip.
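The abstract does not state how the photographs were degraded in the equal-quality condition. As a purely illustrative sketch, one crude way to strip high spatial frequencies from an image is block averaging: each small block of pixels is replaced by its mean, so fine detail is lost while coarse structure survives. The function name `reduce_quality` and the parameter `factor` are hypothetical and are not taken from the study.

```python
def reduce_quality(image, factor):
    """Replace each factor x factor block of grey levels with its mean.

    Hypothetical helper for illustration only. `image` is a list of
    equal-length rows; height and width must be multiples of `factor`.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be multiples of factor")
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, factor):
        for left in range(0, width, factor):
            # Collect the grey levels inside the current block.
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            mean = sum(block) / len(block)
            # Paint the block mean back so the image keeps its size.
            for y in range(top, top + factor):
                for x in range(left, left + factor):
                    out[y][x] = mean
    return out


# A checkerboard is pure fine detail; a left/right split is coarse structure.
checker = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
halves = [[0, 0, 255, 255] for _ in range(4)]

flattened = reduce_quality(checker, 2)  # fine detail averages out to grey
preserved = reduce_quality(halves, 2)   # coarse structure is unchanged
```

In this toy case the checkerboard collapses to a uniform grey while the two-halves pattern is untouched, which mirrors the idea that a degraded photograph retains mainly low-spatial-frequency information.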
In the recognition task, performance did not differ between the discrepant-quality and equal-quality conditions, with or without dynamic information. However, in a matching task containing only dynamic information, the discrepant-quality condition produced a slight (6%) but significant advantage over the equal-quality condition. Moving and static images produced comparable results in both the recognition and matching tasks.
Our results show that face recognition is tolerant of a large discrepancy in image quality between matching stimuli. This finding is surprising given that recognition of unfamiliar faces is known to rely on image similarity between learned and test stimuli. Our study suggests that image similarity in terms of resolution and spatial frequency may interact with other similarity parameters such as lighting, pose, and distinctive external features of faces.
Supported by NSERC and CIHR.