Open Access
Article  |   April 2019
From individual features to full faces: Combining aspects of face information
Author Affiliations
  • Andrew J. Logan
    School of Optometry and Vision Science, University of Bradford, Bradford, UK
    Department of Vision Sciences, Glasgow Caledonian University, Glasgow, UK
    andrew.logan@gcu.ac.uk
  • Gael E. Gordon
    Department of Vision Sciences, Glasgow Caledonian University, Glasgow, UK
    g.gordon@gcu.ac.uk
  • Gunter Loffler
    Department of Vision Sciences, Glasgow Caledonian University, Glasgow, UK
    g.loffler@gcu.ac.uk
Journal of Vision April 2019, Vol.19, 23. doi:10.1167/19.4.23
      © ARVO (1962-2015); The Authors (2016-present)

Abstract

We investigated how information from face features is combined by comparing sensitivity to individual features with that for external (head shape, hairline) and internal (nose, mouth, eyes, eyebrows) feature compounds. Discrimination thresholds were measured for synthetic faces under the following conditions: (a) full faces; (b) individual features (e.g., nose); and (c) feature compounds (either external or internal). Individual features and feature compounds were presented both in isolation and embedded within a fixed, task-irrelevant face context. Relative to the full-face baseline, threshold elevations for the internal feature compound (2.41x) were comparable to those for the most sensitive individual feature (nose = 2.12x). External features demonstrated the same pattern. A model that incorporated all available feature information within a single channel in an efficient way overestimated sensitivity to feature compounds. Embedding individual features within a task-irrelevant context reduced discrimination sensitivity, relative to isolated presentation. Sensitivity to feature compounds, however, was unaffected by embedding. A loss of sensitivity when embedding features within a fixed-face context is consistent with holistic processing, which limits access to information about individual features. However, holistic combination of information across face features is not efficient: Sensitivity to feature compounds is no better than sensitivity to the best individual feature. No effect of embedding internal feature compounds within task-irrelevant external face features (or vice versa) suggests that external and internal features are processed independently.

Introduction
Faces are complex, multidimensional stimuli that provide a wealth of information. In an attempt to break down the vast amount of information, faces may be conceptualized as a combination of individual features (e.g., head shape, hairline, nose, mouth, eyes, and eyebrows; Figure 1a). These components are often broadly categorized as either external features (head shape and hairline) or internal features (nose, mouth, eyes, and eyebrows; middle tier of Figure 1b), a categorization that has received empirical support (Axelrod & Yovel, 2010; Betts & Wilson, 2010; Logan, Gordon, & Loffler, 2017; Nichols, Betts, & Wilson, 2010). 
Figure 1
 
Schematic face feature hierarchy. The two hypothetical models represent alternative ways in which information from individual face features may be combined into a full face representation. (a) A single-stage model: Information from each individual face feature is directly combined into a full face representation. From left to right: Head shape, hairline, external features (head shape and hairline), nose, mouth, eyes, eyebrows, and internal features (nose, mouth, eyes, and eyebrows). The combination of information is represented by the summation sign. (b) A two-stage model: First, information from individual internal features (nose, mouth, eyes, and eyebrows) is integrated to form an internal feature compound. Similarly, information from the head shape and hairline is integrated into an external feature compound. Second, the external and internal feature compounds combine to form a full face.
Feature integration
In previous work, we used synthetic faces to quantify sensitivity for a range of individual face features (Logan et al., 2017). Each synthetic face can be precisely morphed to differ from another by a specific amount on a scale that matches perceptual sensitivity (Wilson, Loffler, & Wilkinson, 2002). We used a match-to-sample discrimination task to measure discrimination thresholds for different features of synthetic faces. The discrimination threshold represents the minimum geometrical difference required between synthetic faces for accurate discrimination. Our results showed that sensitivity was highest for the head shape and hairline, and significantly poorer for any of the internal features (nose, mouth, eyes, and eyebrows). Sensitivity was between 2 and 4 times higher for external relative to internal features. This result is in agreement with previous reports that have found evidence of an external feature advantage for unfamiliar faces (Bruce et al., 1999; Davies, Ellis, & Shepherd, 1977; Fraser, Craig, & Parker, 1990; Haig, 1986; Nachson & Shechory, 2002; Veres-Injac & Persike, 2009). 
Most previous studies, however, have compared sensitivity to external (head shape + hairline) and internal (nose + mouth + eyes + eyebrows) feature compounds (middle tier in Figure 1b), rather than individual, component features (bottom tier in Figure 1a and b). In the present study, we aimed to investigate how the visual system combines information from individual features. To this end, we investigated the extent to which information is combined (symbolized by the summation signs in Figure 1). In order to investigate face feature integration, we compared discrimination thresholds for full faces with those for individual features, as well as with those for external (head shape and hairline) and internal (nose, mouth, eyes, and eyebrows) feature compounds. 
Previous research has provided some insight into the combination of information from individual face features (i.e., the nature of signal integration symbolized by the summation sign in Figure 1). Utilizing an approach that measured the minimum contrast required to discriminate between internal face features, previous reports have found that thresholds for internal feature compounds were significantly lower than those for component features (Gold et al., 2014; Gold, Mundy, & Tjan, 2012; Shen & Palmeri, 2015). Shen and Palmeri (2015) reported that sensitivity to the feature compounds, which included variations in feature position, was significantly greater than that predicted by sensitivity to each individual feature presented in isolation. The authors interpreted this finding as evidence of a full-face advantage that they attributed to synergistic integration of information across multiple face features. On the other hand, Gold et al. (2012) found that although an internal feature compound did improve sensitivity, the improvement was consistent with what would be expected from an observer who simply has access to more information. The improvement was too small to support the existence of special, efficient mechanisms that integrate information from different face features. Hence, it remains unclear to what extent information from individual features is combined. 
To investigate the extent of information integration, we compared discrimination thresholds for individual features with those for feature compounds. If face processing mechanisms synergistically combine information from multiple individual features, sensitivity to groups of features (i.e., feature compounds) should be substantially greater than that measured for any individual feature (e.g., Macmillan & Creelman, 2005). If observers utilize information from a range of features without specialized integration processes, the improvement would be small (Gold et al., 2012; Macmillan & Creelman, 2005). Finally, observers may base their decision on the feature to which they are most sensitive, ignoring other available information (a decision separability rule; Macmillan & Creelman, 2005). In this case, sensitivity to the feature compound would be expected to be equivalent to that for the individual feature to which sensitivity is greatest. 
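These alternatives can be illustrated with a small numerical sketch. It is not the model fitted in this study. It assumes that sensitivity is simply the reciprocal of threshold elevation and uses quadrature summation of independent sensitivities as one textbook benchmark for combining cues. Under these assumptions, synergistic integration would produce a lower compound threshold than the quadrature benchmark, whereas a best-feature rule predicts no improvement over the most sensitive feature. The function names and the example elevations are hypothetical.

```python
import math

def quadrature_elevation(elevations):
    """Compound threshold elevation if independent feature sensitivities
    (assumed here to be 1 / elevation) add in quadrature."""
    return 1.0 / math.sqrt(sum((1.0 / e) ** 2 for e in elevations))

def best_feature_elevation(elevations):
    """Compound threshold elevation if the observer relies only on the
    feature with the lowest threshold and ignores the rest."""
    return min(elevations)

# Hypothetical threshold elevations (relative to the full face), for illustration only.
example = {"nose": 2.0, "mouth": 3.0, "eyes": 3.0, "eyebrows": 4.0}

quad = quadrature_elevation(example.values())
best = best_feature_elevation(example.values())
print(f"quadrature benchmark: {quad:.2f}x, best-feature rule: {best:.2f}x")
```

Comparing a measured compound threshold against these two values indicates which rule it sits closer to, within the limits of the stated assumptions.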
Holistic processing
Faces are often considered a special class of visual object that are processed in a different way from other objects (McKone, Kanwisher, & Duchaine, 2007). One of the distinguishing features of faces is the extent to which they engage holistic processing—the combination of individual features into a singular, interdependent representation (Maurer, Le Grand, & Mondloch, 2002; Rossion, 2008). 
We showed previously that the ability to discriminate face features is impaired when these features are presented within a fixed, task-irrelevant face context (Logan et al., 2017). We interpreted this result as evidence of holistic processing: The integration of information across a face image impairs the subsequent extraction of information about an individual feature. This finding is consistent with the composite face effect, a classic demonstration of holistic processing: Combining the top half of one face with the bottom half of another impairs sensitivity to the component identities (Mondloch, Pathman, Maurer, Le Grand, & de Schonen, 2007; Young, Hellawell, & Hay, 1987). 
In the present study, we aimed to investigate alternative ways in which face information might be combined into holistic representations. Specifically, we aimed to determine whether the information from the external and internal features is combined into a singular representation (one-stage process; Figure 1a), or processed independently (two-stage process; Figure 1b). The latter includes an intermediate representation of external and internal feature compounds. 
On one hand, fMRI studies have found that the blood oxygen-level-dependent (BOLD) signal recorded from the fusiform face area (FFA) is sensitive to changes made both to the external and to the internal features of full faces (Andrews, Davies-Thompson, Kingstone, & Young, 2010; Axelrod & Yovel, 2010), consistent with a one-stage process. On the other hand, it has been reported that full faces, external features, and internal features are represented by independent populations of neurons within occipito-temporal cortex (Betts & Wilson, 2010; Nichols et al., 2010). Similarly, single cell recordings from area IT of the macaque monkey have identified neurons that demonstrate selectivity for individual external (e.g., hair) and internal (e.g., eyes) features (Freiwald, Tsao, & Livingstone, 2009). These results suggest a two-stage process. 
To investigate whether information from external and internal features is combined into a single-stage, holistic representation, we compared sensitivity to feature compounds presented in isolation with that for the same feature compounds embedded within an otherwise fixed-face context (i.e., internal feature compounds presented within generic external features, and vice versa). Identical task-relevant information is available in both the isolated and embedded conditions. The embedded condition, however, includes additional, task-irrelevant information. Two distinct predictions can be made about the effect of embedding. First, if embedding has no effect on discrimination sensitivity, this would support the proposal that external and internal features are processed independently (Figure 1b). If, on the other hand, embedding features within a fixed-face context reduces sensitivity, relative to presentation in isolation, this would indicate that information from external and internal features is automatically combined (Figure 1a). The latter result would suggest that holistic processing impedes the extraction of information about either the external or internal features from full faces. 
Methods
Synthetic faces
This study used synthetic faces. Most face studies have utilized photographs or digitally manipulated face images. The complexity of these stimuli can make it difficult to directly relate changes in sensitivity to specific aspects of face information. Synthetic faces (Wilson et al., 2002) are simplified stimuli that capture the major geometrical information (head shape, hairline, feature size, and position) of a face photograph. These face images have the advantage that they can be manipulated in a controlled and precise way, independent of face identity. 
To create synthetic faces, the salient geometric face information was digitized from grayscale face photographs with neutral expressions (Figure 2, top; Wilson et al., 2002). The two-dimensional synthetic face images contain a minimal amount of information that has been shown to be sufficient to allow accurate identification (Wilson et al., 2002). For example, while color (e.g., skin, hair, and eye) and texture are present in real faces, faces can be identified without this information. As a result, the synthetic faces include neither color nor texture. 
Figure 2
 
Synthetic faces. Top: (a) Grayscale photograph superimposed with a polar coordinate grid centered on the bridge of the nose. The head shape was measured at 16 locations around the external contour, angularly positioned at equal intervals of 22.5° (outermost small white circles); nine points in the upper half of the face captured hairline information. Head shape and hairline were defined by radial frequency (RF) contours, the parameters of which were derived from the positions of these points. The positions and shapes of the internal face features were defined by 14 additional measurements. The position of all features was idiosyncratic, as derived from the photograph. The shapes of the eyes and eyebrows were generic; those of the mouth and nose were individualized. In sum, each synthetic face is defined by 37 parameters and represented by a 37-dimensional vector (see Wilson et al., 2002 for further details). (b) Photograph filtered with a 2.0 octave bandwidth difference of Gaussian filter with peak spatial frequency of 10 c/face width. (c) Corresponding synthetic face. Bottom: Synthetic faces were adjusted by manipulating their distinctiveness, i.e., by how much they differ from the mean face (left). Increasing face distinctiveness results in individual faces becoming progressively more dissimilar (from middle to right) to the mean face. Distinctiveness is expressed as a percentage of mean head radius and quantifies the total geometric variation between the specified face and the mean face. Typical observers can discriminate a complete face from the mean at about 5% distinctiveness (Wilson et al., 2002).
All internal features carried positional information, relative to the center of the face and the other features. This information is therefore only available within a face context. The mouth and nose also carried shape information. Mouth and nose shapes were produced by altering generic feature templates in terms of length and width based on individual face measurements from the original face photographs. Eyes and eyebrows were generic in shape but provided additional positional information that was independent of the other features because they were presented in pairs. Thus, each of the internal features (mouth, nose, eyes, eyebrows) carried positional information and one additional piece of information that was available without a face context (i.e., when these features were presented in isolation). 
The face images were band-pass filtered at the spatial frequency that has been reported to be optimal for face identification (10 cycles/face width, circular difference of Gaussian filter with a bandwidth of 2.0 octaves; Näsänen, 1999). While the optimal spatial frequency may be task-dependent, the resulting faces accentuate geometric information in the most important frequency band while omitting cues such as hair and skin texture, skin color, and wrinkles. 
All face measurements (i.e., the 37-dimensional vector representing each face) were normalized by the mean head size of the respective gender, resulting in faces that differed in terms of individual features (e.g., head shape and eye position) but not in overall size. A mean face was produced by averaging each of the 37 dimensions of all synthetic faces of the same gender. Within this framework, synthetic faces can be manipulated to have a defined difference from the mean face (Figure 2, bottom). This geometric difference quantifies the total difference of a face from the mean (i.e., its distinctiveness), expressed as a percentage of the mean head radius. It has been shown that this metric captures discrimination sensitivity independent of face identity (Wilson et al., 2002). 
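The distinctiveness manipulation can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB implementation. It assumes that each face is a list of 37 measurements already normalized so that one unit equals the mean head radius, and that the "total geometric difference" is the Euclidean distance between a face vector and the mean vector. The function names are hypothetical.

```python
import math

def distinctiveness(face, mean_face):
    """Distance of `face` from `mean_face`, as a percentage of mean head radius.
    Assumes both vectors are in units of mean head radius (Euclidean distance)."""
    return 100.0 * math.sqrt(sum((f - m) ** 2 for f, m in zip(face, mean_face)))

def morph_to_distinctiveness(face, mean_face, percent):
    """Return a face on the line from the mean face through `face` whose
    distance from the mean equals `percent` % of the mean head radius."""
    diff = [f - m for f, m in zip(face, mean_face)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("face is identical to the mean face; no direction to morph along")
    scale = (percent / 100.0) / norm
    return [m + scale * d for m, d in zip(mean_face, diff)]

# Toy 37-dimensional vectors; the values are arbitrary placeholders.
mean_face = [1.0] * 37
individual = [1.0 + 0.01 * ((i % 5) - 2) for i in range(37)]

face_10 = morph_to_distinctiveness(individual, mean_face, 10.0)
print(round(distinctiveness(face_10, mean_face), 6))
```

Because only the length of the difference vector is rescaled, the morphed face keeps the identity direction of the original while its distance from the mean is set by the requested percentage.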
In this study, synthetic faces from four different Caucasian male individuals were used. At the test distance of 1.20 m, each face subtended 5.5° of visual angle in height. 
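As a worked check on the stimulus geometry, the physical height of a face on the screen can be recovered from its visual angle and the viewing distance with the standard relation size = 2 · d · tan(θ/2). The helper name below is hypothetical; the 5.5° and 1.20 m values are those stated above.

```python
import math

def size_from_visual_angle(angle_deg, distance_m):
    """Physical extent (in meters) that subtends `angle_deg` at `distance_m`."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)

face_height_m = size_from_visual_angle(5.5, 1.20)
print(f"face height on screen: {face_height_m * 100:.1f} cm")
```

Under these values each face was roughly 11.5 cm tall on the monitor.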
Observers
One author (AJL) and three naive observers (mean age = 22.5 years old, range = 19–26) completed the experiments. All four participants (one male) were in good health with normal or corrected-to-normal vision (visual acuity logMAR 0.00 or better, no visual abnormalities). No reimbursement was offered for participation. Participants gave informed consent in accordance with the Declaration of Helsinki, as approved by the Human Subjects Ethics Committee of Glasgow Caledonian University. 
Apparatus
All trials were completed under binocular viewing, under an ambient illumination of 75 cd/m². Observers were seated at 1.20 m from a computer monitor. Accurate viewing distance was maintained with a chin and forehead rest. Stimuli were created in MATLAB (www.mathworks.com; MathWorks, Natick, MA) and presented, using routines from the Psychtoolbox extension (Brainard, 1997; Pelli, 1997), on a LaCie high resolution monitor (1024 × 768 at 85 Hz) of 61 cd/m² mean luminance, which was controlled by a Mac mini computer. One hundred and fifty equally spaced gray levels were used to maximize contrast linearity. At the test distance, the computer monitor subtended 13.4° by 10.1° of visual angle; one pixel was 0.018°. 
Procedure
A two-alternative forced choice (2AFC) procedure, using the method of constant stimuli, was used across all conditions. A target image was shown for 110 ms, followed by a low-level, Gaussian noise mask and then a uniform gray screen, each for 200 ms. The mask was created by applying the same band-pass filter used to create the synthetic faces to a two-dimensional binary noise array. The noise mask was used to remove any residual visual transient from the target exposure. Short target durations were used to minimize eye movements. Exposures of 90 ms have previously been shown to be sufficient for a face discrimination task; any further increase in target duration did not improve accuracy (Lehky, 2000; Veres-Injac & Persike, 2009). 
Following the offset of the gray screen, two images were presented side by side. One matched the target (Figure 3). To adjust task difficulty, the other (distractor) differed from the target by a specified amount, dependent upon observer sensitivity and condition. The observer was asked to indicate the target via computer mouse click. The two choices remained on the screen until the decision had been made. Participants were encouraged to respond quickly and to guess when uncertain. No feedback was provided. 
Figure 3
 
Procedure. Top: A single trial for the full face condition: A target face is shown for 110 ms, followed first by a noise mask, then a blank screen (200 ms each) and finally by two faces side by side. Observers had to select which of the two faces matched the target (2AFC). In this example, a face (righthand side in 2AFC) with a distinctiveness of 10% is the target, which has to be discriminated from the mean face (0% distinctiveness; distractor). Middle: (a) + (b) isolated feature conditions. In (a), an isolated external feature compound (lefthand side in 2AFC) has to be discriminated from a distractor external feature compound. The difference between target and distractor in all examples is 10% distinctiveness. (b) Illustrates an example of the isolated internal feature compound condition (target is lefthand side in 2AFC). Bottom: (c) + (d) Embedded feature conditions: A feature compound is embedded within an otherwise fixed face. (c) Illustrates an example of the embedded external feature compound condition: The difference between the target (left in 2AFC) and distractor lies solely in the external features; the internal features are identical and task-irrelevant. (d) Represents an example of the embedded internal feature condition: The difference between the target (left in 2AFC) and distractor is now restricted to the internal features; the external features are identical and task-irrelevant.
Within each experimental block, discrimination thresholds were measured for four face identities, presented randomly, using an interleaved design. Accordingly, observers were uncertain about the identity of the face on each trial. Discrimination accuracy for each identity was measured at six increments of face distinctiveness. Each level of distinctiveness was tested 20 times, resulting in 120 trials for each determination of threshold, and a total of 480 trials per experimental run. Data were fit by a Quick function (Quick, 1974) using a maximum likelihood procedure (separately for each identity). Discrimination thresholds were subsequently extracted from the fitted functions and defined as the distinctiveness value which was associated with 75% accuracy. 
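The threshold estimation step can be sketched as follows. This is an illustrative stand-in for the fitting routine, not the code used in the study. It assumes a common 2AFC parameterization of the Quick function, P(x) = 1 − 0.5 · 2^(−(x/α)^β), in which α is directly the 75% correct point, and it replaces a proper optimizer with a coarse grid search over α and β. The response counts are invented for the example.

```python
import math

def quick_2afc(x, alpha, beta):
    """Quick function scaled for 2AFC: 0.5 at x = 0, 0.75 at x = alpha, rising toward 1.0."""
    return 1.0 - 0.5 * 2.0 ** (-((x / alpha) ** beta))

def neg_log_likelihood(levels, n_correct, n_trials, alpha, beta):
    """Binomial negative log-likelihood of the observed counts under the model."""
    total = 0.0
    for x, k, n in zip(levels, n_correct, n_trials):
        p = min(max(quick_2afc(x, alpha, beta), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_quick(levels, n_correct, n_trials):
    """Coarse grid-search maximum-likelihood fit; returns (alpha, beta)."""
    best = (float("inf"), None, None)
    for a in range(1, 401):        # alpha from 0.05 to 20.00 (% distinctiveness)
        for b in range(1, 81):     # beta (slope) from 0.1 to 8.0
            alpha, beta = a * 0.05, b * 0.1
            nll = neg_log_likelihood(levels, n_correct, n_trials, alpha, beta)
            if nll < best[0]:
                best = (nll, alpha, beta)
    return best[1], best[2]

# Hypothetical data for one identity: six distinctiveness levels, 20 trials each.
levels = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
n_trials = [20] * 6
n_correct = [10, 13, 16, 18, 19, 20]

alpha, beta = fit_quick(levels, n_correct, n_trials)
print(f"75% threshold: {alpha:.2f}% distinctiveness (slope {beta:.1f})")
```

With this parameterization no separate inversion is needed to read off the threshold, since the fitted α is the distinctiveness at 75% accuracy.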
Condition 1: Full faces
In the full face condition, observers were required to discriminate between the mean face and a face in which all of the features differed from the mean face by equal percentages (Figure 3, top). The relative weight of each feature was manipulated by the same amount (the same relative level of geometric difference from the corresponding feature within the mean face). We defined this difference as the “face distinctiveness” and expressed it as a percentage of the mean head radius. For example, a full face presented at 10% distinctiveness was created by morphing each component feature (head shape, hairline, nose, mouth, eyes, and eyebrows) to differ from that of the mean face by 10% of the mean head radius. The mean face was randomly assigned as the target face in 50% of trials. 
Condition 2: Isolated feature compounds
The procedure was the same as for the full face condition; however, observers were now asked to discriminate between feature compounds presented in isolation, rather than full faces. Two categories of feature compounds were tested: external features (head shape and hairline) and internal features (nose, mouth, eyes, and eyebrows; Figure 3a and b). 
Isolated feature compounds carried an identical distinctiveness signal to the same features when they were part of a full face. For example, the isolated internal feature compound at 20% distinctiveness was the internal features extracted from a full face at 20% distinctiveness. This allows direct comparison of sensitivities across any of the conditions tested in these experiments. The same four face identities were used in all conditions. Hence, the isolated condition presented feature compounds that were identical to those at the corresponding distinctiveness level for the full face. This approach allowed us to determine sensitivities for each feature and feature compound individually and simulate model predictions for how observers combine information by extracting individual feature gains and transducers from the data (see below). 
External and internal feature compounds of four different identities at varying levels of distinctiveness were presented randomly within experimental runs, using an interleaved design. This prevented observers from predicting which features would be tested on individual trials. 
Condition 3: Embedded feature compounds
This condition was identical to Condition 2, apart from the addition of a task-irrelevant, fixed-face context. Discrimination thresholds were measured for external and internal feature compounds embedded within fixed features of a generic face (Figure 3c and d). Only the feature compound of interest varied between the target and distractor; all other features were identical. For example, in Figure 3c, the difference between the target and the distractor lies solely in the external features (head shape and hairline); the internal face features in the target and distractor faces are identical. As in Condition 2, an interleaved design was used in which face identity and feature compounds were randomly intermixed. Therefore, observers could not predict which feature compound was tested on any individual trial. It should be noted that this approach differs from the composite face effect paradigm, in which observers are typically instructed to attend to a specific region of the face (Rossion, 2013). 
Statistical analysis
All statistical analyses utilized a one-factor, repeated measures analysis of variance (ANOVA), unless otherwise specified. Where Mauchly's test indicated that a violation of the sphericity assumption had occurred, the Greenhouse-Geisser correction was utilized. 
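For readers unfamiliar with the test, the F ratio of a one-factor repeated measures ANOVA can be computed directly from its sums of squares. The sketch below is a minimal illustration on invented data (four observers by four conditions) and is not the software used for the reported analyses. It does not implement Mauchly's test or the Greenhouse-Geisser correction, and it returns only F and its degrees of freedom, not a p value.

```python
def rm_anova_oneway(data):
    """One-factor repeated measures ANOVA.
    data[s][c] is the score of subject s in condition c.
    Returns (F, df_effect, df_error)."""
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(data[s][c] for s in range(n)) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject residual
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_effect) / (ss_error / df_error)
    return f_ratio, df_effect, df_error

# Hypothetical thresholds (% distinctiveness): rows are observers, columns are conditions.
thresholds = [
    [5.1, 5.6, 5.0, 5.4],
    [4.8, 5.2, 5.3, 5.0],
    [6.0, 5.7, 5.9, 6.2],
    [5.3, 5.5, 5.1, 5.6],
]

f_ratio, df_effect, df_error = rm_anova_oneway(thresholds)
print(f"F({df_effect}, {df_error}) = {f_ratio:.2f}")
```

With four levels of the factor and four observers, the degrees of freedom are 3 and 9.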
Results
There was no significant effect of face identity, F(3, 9) = 0.85; p = 0.500, or observer, F(3, 16) = 0.78; p = 0.521, on discrimination thresholds. Accordingly, face discrimination thresholds were averaged across face identity and average data are considered in all subsequent analyses. 
The mean full face discrimination threshold across observers was 5.37%. This value is in line with results of previous investigations of synthetic face discrimination (Loffler, Gordon, Wilkinson, Goren, & Wilson, 2005; Wilson et al., 2002). For example, Logan, Wilkinson, Wilson, Gordon, and Loffler (2016) reported a range between 3.33% and 8.84% for 52 typical observers. 
The full face condition served as a baseline to which all other conditions were compared. The data are therefore presented as threshold elevations, relative to thresholds for the full face condition. Threshold elevations are inversely proportional to sensitivity. 
Owing to the mathematical framework upon which synthetic faces are based, and the fact that the same face identities and features were used throughout, thresholds measured for different conditions are directly comparable. For example, a threshold elevation of 3.00 for an isolated nose indicates that, for reliable discrimination, observers required the difference between noses to be 3 times larger when they were presented in isolation than when they were part of a full face (in which all of the features changed by equivalent proportions). Similarly, a threshold elevation of 2.00 for an internal feature compound indicates that twice the difference is required between the internal features for reliable discrimination when presented on their own, compared to when part of a full face. 
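The conversion between thresholds and threshold elevations is simple arithmetic, sketched below with the mean full face threshold (5.37%) and the elevations for the internal feature compound (2.41x) and the isolated nose (2.12x) reported in this article. The function names are hypothetical. The back-converted thresholds are approximate, on the assumption that elevations may have been computed per observer before averaging.

```python
def threshold_elevation(condition_threshold, full_face_threshold):
    """Threshold in a condition expressed relative to the full face baseline (= 1)."""
    return condition_threshold / full_face_threshold

def threshold_from_elevation(elevation, full_face_threshold):
    """Approximate threshold (% distinctiveness) implied by a threshold elevation."""
    return elevation * full_face_threshold

FULL_FACE = 5.37  # mean full face threshold, % distinctiveness

internal_compound = threshold_from_elevation(2.41, FULL_FACE)
isolated_nose = threshold_from_elevation(2.12, FULL_FACE)
print(f"internal compound: ~{internal_compound:.1f}%, isolated nose: ~{isolated_nose:.1f}%")
```

On this reading, the internal feature compound required a difference of roughly 13% distinctiveness for reliable discrimination, compared with about 5% for the full face.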
Combining individual features into feature compounds
We have previously measured discrimination thresholds for a number of individual external (head shape, hairline) and internal (nose, mouth, eyes, eyebrows) features (Logan et al., 2017). In the first phase of the present study, we compared discrimination thresholds for these individual features with those for the associated external and internal feature compounds (Figure 1b). 
Combining internal features
Threshold elevations for individual features depended strongly on the features that were visible, F(1.6, 4.9) = 109.91; p < 0.001. Sensitivity to all of the individual internal features presented in isolation was significantly lower than that for the full face (pairwise comparisons with Bonferroni correction; all ps < 0.001). Threshold elevations ranged from 2.12x for the most sensitive (nose) to 4.47x for the least sensitive (eyebrows; Figure 4a). Importantly, sensitivity to the combined internal feature compound (M ± SD; 2.41 ± 0.34) was no better than that measured for the isolated feature to which observers were most sensitive (nose, 2.12 ± 0.46). 
Figure 4
 
Experimental data and model predictions for internal feature compounds. (a) Isolated internal features. Data are given as threshold elevations, relative to the full face condition (= 1). Measured sensitivity for each isolated internal feature (nose, mouth, eyes, and eyebrows) is shown in the light gray bars (Logan et al., 2017). Measured sensitivity for the internal feature compound is given in the dark gray bar. The white bar represents the sensitivity for the internal feature compound predicted by a single channel model, based on efficient integration of information from all individual features. This model predicts a significant improvement in sensitivity for the compound, relative to that for any of the internal features, including the most sensitive one (nose). Measured sensitivity to the isolated internal feature compound, however, was no better than that for the most sensitive feature. The model therefore overestimates sensitivity to the internal feature compound. (b) Embedded internal features. Sensitivities were typically lower when features were embedded in a task-irrelevant fixed face context but the pattern of relative sensitivities was comparable to the isolated condition. Again, the model predicts an improvement in sensitivity for the embedded internal feature compound, relative to that for the most sensitive feature (nose), but this overestimates sensitivity for the internal feature compound. The icons above the graph indicate the features under test, although the increased contrast in (b) is simply used to highlight the variable feature—there was no difference in contrast between features in the experiments. Observers were unaware of the features that were tested on individual trials. The error bars, here and elsewhere, denote 95% confidence intervals (N = 4).
Embedding individual features within a fixed face context generally resulted in poorer performance (Figure 4b). The overall pattern of results, however, was comparable to that for the isolated condition: Sensitivity to the compound (2.56 ± 0.31) was similar to that for the most sensitive feature (nose, 2.67 ± 0.76; Figure 4b). 
In sum, sensitivity to the internal feature compound was no greater than that for the best component, for both isolated and embedded conditions. 
Performance in the feature compound condition can be compared to model predictions based on sensitivities to the individual features. This can be used to describe the visual system's ability to integrate information from multiple sources when deriving a full face representation. 
The model we have implemented is based within the framework of signal detection theory (Green & Swets, 1966). Signal detection theory assumes that internal sensory responses to any external stimulus follow a Gaussian distribution, which represents the relative likelihood of a particular sensory response. The shape of the distribution is due to internal noise, and the location of the Gaussian on the continuous internal response scale is given by the strength of the external stimulus. In order to calculate performance in, for example, a 2AFC experimental set-up, one has to compare the noise-only distribution with the distribution when a certain external stimulus is present. The stronger the signal, the further the two distributions are separated. The separation is measured by d′. This framework can be used to calculate the expected sensitivity to a combination of stimuli based on sensitivities to individual stimuli. For example, sensitivity to a face feature compound can be predicted from that for individual face features. The first step in this calculation is to relate the strength of the internal signal to the strength of the external stimulus (e.g., Kingdom, Baldwin, & Schmidtmann, 2015):  
\begin{equation}\tag{1}d^{\prime}_{\rm component} = \left( gs \right)^{\tau}\end{equation}
 
Equation 1 relates d′, the separation between the noise-only and signal distributions (in units of the standard deviation of the noise distribution), to the external stimulus strength (s) by a scaling factor (g, gain) and an exponent (τ, transducer). Separate gains and transducers were fit to the average data for each of the individual face features. The model's prediction for any combination of features can then be calculated using:  
\begin{equation}\tag{2}d^{\prime}_{\rm compound} = {1 \over \sqrt{Q}} \sum\limits_{i = 1}^{n} \left( gs \right)^{\tau}\end{equation}
 
That is, the estimated d′ for the compound is found by summation of the signal strengths, scaled by the relevant gains and transducers, for each of the components (n). Q is the number of channels that are being monitored by the observer. The square root of the number of channels in the denominator reflects the fact that, when adding noise, the relevant parameter is the variance of the noise (see Kingdom et al., 2015). This equation can be applied to equal or unequal stimulus strengths and to individual channels that differ in sensitivity (i.e., with regard to the gain and transducer), as is the case in our experiments. From the resulting d′ for the compound, one can then calculate percent correct responses for different input strengths and extrapolate a threshold from fitted psychometric functions. The model simulations were carried out by implementing several functions from the Palamedes Toolbox (Prins & Kingdom, 2018), in particular PAL_SDT_AS_uneqSLtoPC. The detailed mathematical derivations of the equations can be found in Kingdom et al. (2015) and Kingdom and Prins (2016). The simulations assumed a typical slope of the psychometric function of 2. 
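The chain from Equations 1 and 2 to a predicted threshold can be illustrated with a minimal Python sketch. It is not the Palamedes implementation used for the simulations: it assumes equal stimulus strength for all components, the textbook unbiased 2AFC relation Pc = Φ(d′/√2), a 75% correct criterion, and placeholder gains and transducers rather than the fitted values. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def d_prime_component(s, g, tau):
    """Equation 1: d' for one feature with gain g and transducer tau."""
    return (g * s) ** tau


def d_prime_compound(s, gains, taus, q=1):
    """Equation 2: summed component d' values divided by sqrt(Q).

    Assumes the same stimulus strength s for every component.
    """
    total = sum(d_prime_component(s, g, t) for g, t in zip(gains, taus))
    return total / sqrt(q)


def percent_correct_2afc(d_prime):
    """Assumed textbook relation for an unbiased 2AFC observer."""
    return NormalDist().cdf(d_prime / sqrt(2))


def predicted_threshold(gains, taus, q=1, criterion=0.75, lo=0.0, hi=100.0):
    """Bisection search for the stimulus strength at which Pc reaches the criterion."""
    for _ in range(100):
        mid = (lo + hi) / 2
        pc = percent_correct_2afc(d_prime_compound(mid, gains, taus, q))
        if pc < criterion:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Placeholder parameters (not the fitted values from the study).
gains = [0.5, 0.25]
taus = [2.0, 2.0]

best_alone = predicted_threshold(gains[:1], taus[:1])
both_q1 = predicted_threshold(gains, taus, q=1)
both_q2 = predicted_threshold(gains, taus, q=2)
print("best component alone:", round(best_alone, 3))
print("compound, Q = 1:", round(both_q1, 3))
print("compound, Q = 2:", round(both_q2, 3))
```

With a single noise source (Q = 1), adding a second component always lowers the predicted threshold below that of the best component alone; monitoring more channels (Q > 1) adds noise and offsets part or all of that gain.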
In order to describe the predicted sensitivity for the case when observers make best use of the available information from individual sources (i.e., face features), we applied a model in which the d′ values for the components are additively combined by setting Q = 1. This is a single channel model with a single source of noise, which assumes that the signals from different face features are combined in the most efficient way (e.g., Kingdom et al., 2015; Kingdom & Prins, 2016; MacMillan & Creelman, 2005). The model assumes that observers combine information from all features; the contribution of each feature to the overall response is proportional to the sensitivity for that feature. Consequently, individual features that are associated with low levels of sensitivity contribute comparatively little to predicted sensitivity for the feature compound and vice versa. 
This model predicts enhanced sensitivity to the isolated internal feature compound, relative to all of the isolated features, including the most sensitive (nose; Figure 4a), and therefore overestimates sensitivity to the isolated internal feature compound. Specifically, the threshold elevation predicted by the model for the isolated internal feature compound (1.26) was lower than that measured empirically (2.41 ± 0.34), and outside of the 95% confidence interval. 
The model also overestimated sensitivity for the embedded internal feature compound. As for isolated features, the threshold elevation predicted by the model (1.63) was significantly lower than that measured empirically (2.56 ± 0.31). 
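The comparison between the model predictions and the measured compound thresholds can be checked with a short calculation. The sketch below assumes that the reported ± values are standard deviations across the N = 4 observers (as stated for the isolated compound, 2.41 ± 0.34) and that the 95% confidence interval is a conventional t-based interval; how the intervals in the figures were actually computed is not specified here, so this is an illustration rather than a reproduction.

```python
from math import sqrt

# Two-tailed 95% critical value of Student's t for df = N - 1 = 3.
T_CRIT_DF3 = 3.182


def ci95(mean, sd, n=4, t_crit=T_CRIT_DF3):
    """Assumed t-based 95% confidence interval for a mean of n observers."""
    half_width = t_crit * sd / sqrt(n)
    return mean - half_width, mean + half_width


# (measured mean, SD, model prediction) for the internal feature compound.
conditions = {
    "isolated": (2.41, 0.34, 1.26),
    "embedded": (2.56, 0.31, 1.63),
}

outside = {}
for name, (mean, sd, predicted) in conditions.items():
    low, high = ci95(mean, sd)
    outside[name] = not (low <= predicted <= high)
    print(name, "CI:", round(low, 2), "to", round(high, 2),
          "| prediction", predicted, "outside CI:", outside[name])
```

Under these assumptions, both model predictions fall below the lower bound of the interval around the measured threshold elevation.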
Because sensitivity for the compound is no better than that for the most sensitive feature, our data do not support efficient signal integration across face features as predicted by the model. Rather, the results are consistent with a decision separability rule, where the most sensitive feature limits overall performance, with no apparent contribution from other sources of information. 
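The difference between the two accounts can be made concrete with a small sketch. It assumes that threshold corresponds to a fixed criterion d′, that all components share one transducer exponent (set to 2 purely for illustration), and that stimulus strength is equal across components. Under those assumptions, Equation 2 with Q = 1 has the closed form T = (Σ T_i^(−τ))^(−1/τ), whereas a best-feature rule simply returns the lowest component threshold. Only the two internal features whose elevations are quoted in the text (nose and eyebrows) are used, so the output is not comparable to the four-feature prediction reported above.

```python
def best_feature_elevation(elevations):
    """Best-feature rule: the compound is limited by the most sensitive component."""
    return min(elevations)


def additive_summation_elevation(elevations, tau):
    """Closed form of Equation 2 with Q = 1.

    Assumes equal stimulus strength, a common transducer tau, and gains
    g_i = 1 / T_i so that each component reaches criterion d' at its own threshold.
    """
    return sum(t ** -tau for t in elevations) ** (-1.0 / tau)


# Isolated threshold elevations quoted in the text: nose and eyebrows.
elevations = [2.12, 4.47]
tau = 2.0  # illustrative assumption, not a fitted value

best = best_feature_elevation(elevations)
additive = additive_summation_elevation(elevations, tau)
print("best-feature rule:", round(best, 2))
print("additive summation (Q = 1):", round(additive, 2))
```

Additive summation always predicts a compound threshold below that of the best component, whereas the measured compound thresholds matched the best component, which is the pattern expected from the best-feature rule.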
Combining external features
Overall, a similar pattern of results was found for the external features (Figure 5). Measured sensitivity for the isolated external feature compound (head shape and hairline; 0.87 ± 0.14) was no better than that for the best external feature alone (head shape; 0.84 ± 0.14). Sensitivity to the embedded external feature compound (0.96 ± 0.05) was likewise no better than that for the best embedded external feature (head shape; 0.94 ± 0.08). This is perhaps unsurprising since sensitivity for the head shape alone is as high as that for the full face, leaving little room for improvement. 
Figure 5
 
Experimental data and model predictions for external feature conditions. (a) Isolated external features. Measured sensitivity for each isolated external feature (head shape and hairline) is shown in the light gray bars. Measured sensitivity for the isolated external feature compound is given in the dark gray bar. The white bar represents the sensitivity predicted by a model based on efficient integration of available information. Sensitivity to the isolated external feature compound is comparable to that for, but no better than, the most sensitive isolated feature (head shape). (b) Embedded external features. The same pattern was found for the external features embedded within a fixed face context. Specifically, sensitivity to the most sensitive embedded individual feature (head shape) is comparable to that for the embedded external feature compound. Across both isolated and embedded conditions, sensitivity to the best component (head shape) is similar to that for full faces. In both cases, the model predicts sensitivity that is better than that of the best component and thus overestimates performance in the compound conditions.
As for internal features, the model overestimated sensitivity to the isolated external feature compound: the predicted threshold elevation (0.62) was considerably lower than that measured empirically (0.87 ± 0.14), and outside of the 95% confidence interval. In the same way, the model overestimated sensitivity in the embedded condition. The predicted threshold elevation (0.78) was lower than that measured empirically (0.96 ± 0.05). 
In sum, performance for a compound of either internal or external face features is no better than that for the best individual component, both in isolation and embedded within a fixed-face context. This argues against a synergistic signal combination of available information in this type of face discrimination task. 
Combining feature compounds into full faces
We compared discrimination thresholds for external and internal feature compounds, both in isolation and embedded, with those for a full face (Figure 6). This investigates the second stage of a hypothetical feature integration model (Figure 1b). 
Figure 6
 
Sensitivity to feature compounds. Threshold elevations for internal (eyes, nose, mouth, brows) and external (head shape and hairline) feature compounds are presented relative to the full face condition (threshold elevation = 1.00; white bar), either in isolation (light bars) or embedded within an otherwise fixed face (dark bars). Thresholds were significantly elevated from the full face baseline for internal (isolated p = 0.024; embedded p = 0.006), but not external (p = 0.99) compounds. Thresholds did not differ between respective isolated and embedded conditions (p = 0.99) (denoted by “n.s.”). Asterisks indicate significant elevation from the full face baseline (pairwise comparisons with Bonferroni correction; p < 0.05).
Threshold elevations for external features in isolation and embedded within the context of a task-irrelevant mean face were 0.87 ± 0.14 and 0.96 ± 0.05, respectively. Corresponding threshold elevations for the internal features were 2.41 ± 0.34 and 2.56 ± 0.31. Thus, observers were, on average, less than half as sensitive to the internal, relative to external, feature compounds. 
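Because threshold elevation is inversely proportional to sensitivity, the relative sensitivity to external and internal compounds follows directly from the ratio of the elevations reported above. A brief sketch of that arithmetic (variable names are illustrative):

```python
# Mean threshold elevations reported in the text.
external = {"isolated": 0.87, "embedded": 0.96}
internal = {"isolated": 2.41, "embedded": 2.56}

# Ratio of elevations = how many times more sensitive observers were
# to the external compound than to the internal compound.
ratios = {context: internal[context] / external[context] for context in external}

for context, ratio in ratios.items():
    print(context, "external/internal sensitivity ratio:", round(ratio, 2))
```

Both ratios exceed 2, which is the sense in which observers were less than half as sensitive to the internal compounds.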
There was a significant main effect of face features on threshold elevations, F(1.2, 3.6) = 89.45; p < 0.001; partial η² = 0.97. For internal features, thresholds for both the isolated (p = 0.024) and embedded (p = 0.006) compounds were significantly elevated from the full-face baseline (Figure 6). In contrast, there was no significant elevation of thresholds for the external compounds (both p = 0.99). This is suggestive of a disproportionate reliance on external features for unfamiliar face discrimination. 
Relative to presentation in isolation, embedding individual features (e.g., eyes, mouth) within a fixed-face context typically reduced discrimination sensitivity (see Figure 4). This is consistent with holistic processing where the combination of information from individual features impairs the ability to subsequently extract information about the individual, contributing parts (Logan et al., 2017). Our data for external and internal feature compounds, on the other hand, do not follow this pattern. There is no significant effect of context (isolated vs. embedded) when features are combined. A two-factor (context [isolated or embedded] and feature group [external or internal]), repeated-measures ANOVA showed no significant effect of context on threshold elevations, F(1, 3) = 2.51; p = 0.99. Moreover, there was no interaction between feature group and context, F(1, 3) =1.45; p = 0.296. 
We interpret the absence of a detrimental effect of embedding for feature compounds as an indication that external and internal features are processed independently of each other. This suggests that, for unfamiliar faces at least, holistic processing takes place between individual features as they are combined into separate external and internal representations, but not across these representations. 
In sum, while embedding individual features within a fixed-face context typically reduces discrimination sensitivity (Logan et al., 2017), we found no equivalent effect of embedding when either all internal or external features are presented in a compound. Adding fixed external features leaves thresholds for internal feature compounds essentially unaffected, and vice versa. This illustrates an important distinction between the processing of individual features and that of feature compounds: Adding task-irrelevant face parts diminishes discrimination sensitivity for the former, but not for the latter. 
Discussion
In line with previous reports, we found clear evidence of an external feature advantage for unfamiliar face discrimination (Bruce et al., 1999; Davies et al., 1977; Fraser et al., 1990; Haig, 1986; Nachson & Shechory, 2002; Veres-Injac & Persike, 2009). Specifically, sensitivity was approximately 2.8 times higher for external, relative to internal, feature compounds. These data extend our previous findings with individual face features to feature compounds (Logan et al., 2017). For individual features, sensitivity to the best internal feature (nose) was between 2.5 and 2.8 times (isolated and embedded feature respectively) poorer than that for the best external feature (head shape). Combining internal features did not improve sensitivity beyond that found for the most sensitive internal feature. Likewise, sensitivity to the external feature compound was equivalent to that for the most sensitive external feature. This suggests that there are separate limits for internal and external features. 
A similar pattern is found when combining internal and external features. Sensitivity to the full face is no better than that of the more sensitive compound (combined external features). Hence, we found no evidence of efficient integration of information from individual face features, nor of efficient integration of information from external and internal compounds. 
These findings are consistent with the premise that external and internal face features are processed largely independently, with a disproportionate reliance on external features for this type of unfamiliar face discrimination task. 
Combining individual face features
An efficient single channel model significantly overestimated sensitivity to feature compounds. Because performance for the combination of features is no better than that for the best component, our data are consistent with a decision separability rule, where the most sensitive feature provides the limit for compound sensitivity. Hence, while face features may be encoded holistically into external and internal representations, there is no evidence of significant facilitation by efficient integration of information from different features. 
Gold et al. (2012) also investigated signal integration across internal face features. In their paradigm, the minimum contrast required to match a low contrast feature to one of six alternatives was determined. Sensitivity to compound features was found to be higher than that for any of the individual features in isolation. This is perhaps not surprising for a contrast-dependent paradigm in which availability of multiple sources of information offers an obvious detection advantage over single sources (Kingdom et al., 2015). Importantly, unlike our results, Gold and colleagues found that sensitivity to the combined features was better than that for the best feature. Sensitivity to the combination of features in their contrast threshold paradigm was similar to, but no greater than, that predicted by an optimal Bayesian integrator. They concluded that, although observers could combine information from individual feature sources, the advantage was insufficient to support evidence in favor of significant synergistic signal integration between face features, in agreement with our conclusions. 
External and internal feature compounds
There is a detrimental effect of embedding individual features within a fixed face context: We found that discrimination thresholds for face features (e.g., nose, eyes) were significantly elevated when embedded within a fixed face context, relative to presentation in isolation (Logan et al., 2017). We empirically determined that this result could not be explained by task complexity (being confronted with multiple features), spatial uncertainty (deciding which of many features is modified) or attention (having to spread attention across an entire face rather than a single feature; Logan et al., 2017). Rather, we attribute this finding to an automatic, compulsory combination of face information that impairs subsequent extraction of information from individual features (i.e., holistic processing). 
It should be noted that this is not inconsistent with the established part–whole effect for familiar faces (Tanaka & Farah, 1993). In this paradigm, participants are initially familiarized with full faces. Recognition accuracy is then assessed for individual features of these learned faces (e.g., nose), presented either in isolation or embedded within a full face context. The part–whole effect describes the finding that recognition accuracy is greater for features embedded within the full face context. In the isolated feature condition of the present study, however, participants were not familiarized with full faces. Rather, participants were presented with a single feature (e.g., nose) and asked to match this to one of two alternatives. Consistent with our data, when observers are familiarized with individual features (rather than full faces), recognition accuracy is greater for features presented in isolation, compared to a full face context (Leder & Carbon, 2005). 
Gold et al. (2012) argued that for their experimental paradigm holistic processing should result in performance that is better (lower required contrast) than that of an optimal Bayesian integrator. As sensitivity to combined internal features was no better than the prediction of the model, they interpreted this as evidence against holistic processing. Rather than linking holistic processing to an increase in sensitivity for face combinations over components, however, we assume that for our discrimination task, holistic processing would result in a compulsory combination of individual features into a face complex, which thereby limits subsequent access to information from individual features. This process does not necessarily have to imply improved discrimination sensitivity when comparing isolated features to feature compounds. Sensitivity to individual features presented in a compound may instead be impaired, relative to presentation of these features in isolation—for example as a result of integrating noise from task-irrelevant features into a compound representation. Under this assumption, holistic processing results in better sensitivity to features presented in isolation relative to when these features are embedded in a fixed face context (Leder & Carbon, 2005). 
No such effect was seen for face compounds. The lack of a detrimental effect of embedding external or internal feature compounds indicates that there is no evidence of holistic processing in this case. This supports the existence of independent external and internal feature representations. Taken together, the results are consistent with holistic combination of individual internal and external features into internal and external representations (bottom to middle tier of Figure 1b), but their respective information may be subsequently processed independently (middle to top tier of Figure 1b). 
It has been proposed that holistic processing is more strongly engaged for familiar, relative to unfamiliar, faces (Harris & Aguirre, 2008). As the present study employed unfamiliar faces, further studies are required to investigate if our results are specific to unfamiliar faces, or if they generalize to familiar ones. 
Recent fMRI studies reported that the external and internal features of synthetic faces are independently represented within face-sensitive human brain areas (i.e., the FFA and occipital face area [OFA]; Betts & Wilson, 2010; Nichols et al., 2010). The results of these neuroimaging studies may therefore provide an explanation for the results from the present behavioral study. If external and internal features are encoded by dissociable populations of neurons, one would expect to find behavioral evidence of independent processing. This is also in line with other behavioral investigations concerning the face prototype effect (i.e., erroneous perception of familiarity for a completely novel face that is made up of features taken from faces with which the observer has previously been familiarized; Cabeza, Bruce, Kato, & Oda, 1999; Solso & McCarthy, 1981). Or and Wilson (2013) used synthetic faces to demonstrate that equally strong, independent face prototype effects are found for both internal and external features presented in an otherwise fixed face context. This was taken as evidence of independent neuronal representation of external and internal features. 
There is further evidence that supports independent representation of external and internal features in face processing. First, both external and internal face features presented in isolation demonstrate a significant inversion effect, an established characteristic of face processing (Moscovitch & Moscovitch, 2000; Nachson & Shechory, 2002). Second, single cell recordings from face-sensitive regions within the superior temporal sulcus (STS) of the macaque monkey have identified neurons which respond selectively to individual features presented in isolation (e.g., eyes; Perrett, Rolls, & Caan, 1982). Finally, unlike noise patterns or nonface objects, isolated internal and external features both produce significant masking effects in an unfamiliar face discrimination task (Loffler, Gordon, et al., 2005). In sum, these results and those of the present study are in line with the premise that external and internal face features are processed independently. 
Synthetic faces
While the majority of previous face perception studies utilized face photographs as stimuli, the present study employed synthetic faces, which combine simplicity with sufficient realism to enable recognition of individual identities (Wilson et al., 2002). The simplicity of these synthetic faces enables the differences between individual identities to be manipulated in a quantifiable and controlled way. This metric is highly sensitive to individual differences in face discrimination ability (Logan et al., 2016). 
The synthetic face approach has some limitations. First, due to their simplified nature, synthetic faces do not include all of the information available in faces or their photographs. Synthetic faces are focused upon salient face geometry (head shape, interocular separation, lip thickness); other aspects of face information (e.g., hair texture, skin surface reflectance) have been excluded. The rationale for this simplification is that humans readily recognize faces over long viewing distances (e.g., 5 m or more), despite significant reductions in the visibility of several aspects of face information (including hair texture and skin surface reflectance; Wilson et al., 2002). 
To generalize the results of the present study to everyday face processing tasks, one must show that synthetic faces engage the same processing mechanisms as real faces. Although synthetic faces are simplified, considerable evidence indicates that they engage the same cortical processes as face photographs. First, Wilson et al. (2002) demonstrated that synthetic faces contain sufficient information to permit individual identification, which is robust to changes in face viewing angle. Synthetic faces also demonstrate behavioral hallmarks of face processing, including a significant face inversion effect (Logan et al., 2016; Wilson et al., 2002), an external feature advantage for unfamiliar face discrimination (Logan et al., 2017), and a left-over-right visual field bias (Schmidtmann, Logan, Kennedy, Gordon, & Loffler, 2015). Neuroimaging evidence indicates that synthetic faces and face photographs elicit a comparable BOLD fMRI signal in the FFA (Loffler, Yourganov, Wilkinson, & Wilson, 2005). Finally, patients with developmental prosopagnosia (a specific impairment of face perception) demonstrate reduced sensitivity to both face photographs and synthetic faces, but not nonface objects (e.g., cars; Lee, Duchaine, Wilson, & Nakayama, 2010; Logan et al., 2016). 
Conclusions
Sensitivity to unfamiliar face discrimination is significantly higher for external, relative to internal, features. Sensitivity to both internal and external feature compounds is no better than that predicted by sensitivity to the best individual component. This is inconsistent with the proposal that information is synergistically combined across multiple face features. Instead, the limiting factor for both external and internal feature combinations is the most sensitive single feature (internal: nose; external: head shape). In line with holistic processing, embedding individual features within a fixed face context, relative to presentation in isolation, significantly reduces discrimination sensitivity. We find no evidence that this face context disadvantage extends to internal and external feature compounds. Our results suggest that, while face features are holistically combined into internal and external representations, information from external and internal face features is processed independently. 
Acknowledgments
The authors thank Sara Rafique, Jennifer Reilly, and Heather Simpson for their assistance with data collection. 
Commercial relationships: none. 
Corresponding author: Andrew J. Logan. 
Address: Department of Vision Sciences, Glasgow Caledonian University, Glasgow, UK. 
References
Andrews, T. J., Davies-Thompson, J., Kingstone, A., & Young, A. W. (2010). Internal and external features of the face are represented holistically in face-selective regions of visual cortex. The Journal of Neuroscience, 30 (9), 3544–3552.
Axelrod, V., & Yovel, G. (2010). External facial features modify the representation of internal facial features in the fusiform face area. Neuroimage, 52 (2), 720–725.
Betts, L. R., & Wilson, H. R. (2010). Heterogeneous structure in face-selective human occipito-temporal cortex. Journal of Cognitive Neuroscience, 22 (10), 2276–2288.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Bruce, V., Henderson, Z., Greenwood, K., Hancock, P. J., Burton, A. M., & Miller, P. (1999). Verification of face identities from images captured on video. Journal of Experimental Psychology: Applied, 5 (4), 339.
Cabeza, R., Bruce, V., Kato, T., & Oda, M. (1999). The prototype effect in face recognition: Extension and limits. Memory & Cognition, 27 (1), 139–151.
Davies, G., Ellis, H., & Shepherd, J. (1977). Cue saliency in faces as assessed by the “Photofit” technique. Perception, 6, 263–269.
Fraser, I. H., Craig, G. L., & Parker, D. M. (1990). Reaction time measures of feature saliency in schematic faces. Perception, 19 (5), 661–673.
Freiwald, W. A., Tsao, D. Y., & Livingstone, M. S. (2009). A face feature space in the macaque temporal lobe. Nature Neuroscience, 12 (9), 1187–1196.
Gold, J. M., Barker, J. D., Barr, S., Bittner, J. L., Bratch, A., Bromfield, W. D.,… & Srinath, A. (2014). The perception of a familiar face is no more than the sum of its parts. Psychonomic Bulletin & Review, 21 (6), 1465–1472.
Gold, J. M., Mundy, P. J., & Tjan, B. S. (2012). The perception of a face is no more than the sum of its parts. Psychological Science, 23 (4), 427–434.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics (Vol. 1). New York: Wiley.
Haig, N. D. (1986). Exploring recognition with interchanged facial features. Perception, 15 (3), 235–247.
Harris, A. M., & Aguirre, G. K. (2008). The effects of parts, wholes, and familiarity on face-selective responses in MEG. Journal of Vision, 8 (10): 4, 1–12, https://doi.org/10.1167/8.10.4.
Kingdom, F. A. A., Baldwin, A. S., & Schmidtmann, G. (2015). Modeling probability and additive summation for detection across multiple mechanisms under the assumptions of signal detection theory. Journal of Vision, 15 (5): 1, 1–16, https://doi.org/10.1167/15.5.1.
Kingdom, F. A. A., & Prins, N. (2016). Psychophysics: A practical introduction. London: Academic Press.
Leder, H., & Carbon, C.-C. (2005). When context hinders! Learn–test compatibility in face recognition. The Quarterly Journal of Experimental Psychology Section A, 58 (2), 235–250.
Lee, Y., Duchaine, B., Wilson, H. R., & Nakayama, K. (2010). Three cases of developmental prosopagnosia from one family: Detailed neuropsychological and psychophysical investigation of face processing. Cortex, 46 (8), 949–964.
Lehky, S. R. (2000). Fine discrimination of faces can be performed rapidly. Journal of Cognitive Neuroscience, 12 (5), 848–855.
Loffler, G., Gordon, G. E., Wilkinson, F., Goren, D., & Wilson, H. R. (2005). Configural masking of faces: Evidence for high-level interactions in face perception. Vision Research, 45 (17), 2287–2297.
Loffler, G., Yourganov, G., Wilkinson, F., & Wilson, H. R. (2005). fMRI evidence for the neural representation of faces. Nature Neuroscience, 8 (10), 1386–1391.
Logan, A. J., Gordon, G. E., & Loffler, G. (2017). Contributions of individual face features to face discrimination. Vision Research, 137, 29–39.
Logan, A. J., Wilkinson, F., Wilson, H. R., Gordon, G. E., & Loffler, G. (2016). The Caledonian face test: A new test of face discrimination. Vision Research, 119, 29–41.
Macmillan, N., & Creelman, C. (2005). Detection theory: A user's guide. Mahwah, NJ: Lawrence Erlbaum.
Maurer, D., Le Grand, R., & Mondloch, C. J. (2002). The many faces of configural processing. Trends in Cognitive Sciences, 6 (6), 255–260.
McKone, E., Kanwisher, N., & Duchaine, B. C. (2007). Can generic expertise explain special processing for faces? Trends in Cognitive Sciences, 11 (1), 8–15.
Mondloch, C. J., Pathman, T., Maurer, D., Le Grand, R., & de Schonen, S. (2007). The composite face effect in six-year-old children: Evidence of adult-like holistic face processing. Visual Cognition, 15 (5), 564–577.
Moscovitch, M., & Moscovitch, D. A. (2000). Super face-inversion effects for isolated internal or external features, and for fractured faces. Cognitive Neuropsychology, 17 (1–3), 201–219.
Nachson, I., & Shechory, M. (2002). Effect of inversion on the recognition of external and internal facial features. Acta Psychologica, 109 (3), 227–238.
Näsänen, R. (1999). Spatial frequency bandwidth used in the recognition of facial images. Vision Research, 39 (23), 3824–3833.
Nichols, D. F., Betts, L. R., & Wilson, H. R. (2010). Decoding of faces and face components in face-sensitive human visual cortex. Frontiers in Psychology, 1 (28), 1–13.
Or, C. C.-F., & Wilson, H. R. (2013). Implicit face prototype learning from geometric information. Vision Research, 82, 1–12.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10 (4), 437–442.
Perrett, D., Rolls, E., & Caan, W. (1982). Visual neurones responsive to faces in the monkey temporal cortex. Experimental Brain Research, 47 (3), 329–342.
Prins, N., & Kingdom, F. A. A. (2018). Applying the model-comparison approach to test specific research hypotheses in psychophysical research using the Palamedes toolbox. Frontiers in Psychology, 9, 1250, https://doi.org/10.3389/fpsyg.2018.01250.
Quick, R. (1974). A vector-magnitude model of contrast detection. Kybernetik, 16 (2), 65–67.
Rossion, B. (2008). Picture-plane inversion leads to qualitative changes of face perception. Acta Psychologica, 128 (2), 274–289.
Rossion, B. (2013). The composite face illusion: A whole window into our understanding of holistic face perception. Visual Cognition, 21 (2), 139–253.
Schmidtmann, G., Logan, A. J., Kennedy, G. J., Gordon, G. E., & Loffler, G. (2015). Distinct lower visual field preference for object shape. Journal of Vision, 15 (5): 18, 1–15, https://doi.org/10.1167/15.5.18.
Shen, J., & Palmeri, T. J. (2015). The perception of a face can be greater than the sum of its parts. Psychonomic Bulletin & Review, 22 (3), 710–716.
Solso, R. L., & McCarthy, J. E. (1981). Prototype formation of faces: A case of pseudo-memory. British Journal of Psychology, 72 (4), 499–503.
Tanaka, J. W., & Farah, M. J. (1993). Parts and wholes in face recognition. The Quarterly Journal of Experimental Psychology, 46 (2), 225–245.
Veres-Injac, B., & Persike, M. (2009). Recognition of briefly presented familiar and unfamiliar faces. Psihologija, 42 (1), 47–66.
Wilson, H. R., Loffler, G., & Wilkinson, F. (2002). Synthetic faces, face cubes, and the geometry of face space. Vision Research, 42 (27), 2909–2923.
Young, A. W., Hellawell, D., & Hay, D. C. (1987). Configurational information in face perception. Perception, 16 (6), 747–759.
Figure 1
 
Schematic face feature hierarchy. The two hypothetical models represent alternative ways in which information from individual face features may be combined into a full face representation. (a) A single-stage model: Information from each individual face feature is directly combined into a full face representation. From left to right: Head shape, hairline, external features (head shape and hairline), nose, mouth, eyes, eyebrows, and internal features (nose, mouth, eyes, and eyebrows). The combination of information is represented by the summation sign. (b) A two-stage model: First, information from individual internal features (nose, mouth, eyes, and eyebrows) is integrated to form an internal feature compound. Similarly, information from the head shape and hairline is integrated into an external feature compound. Second, the external and internal feature compounds combine to form a full face.
Figure 2
 
Synthetic faces. Top: (a) Grayscale photograph superimposed with a polar coordinate grid centered on the bridge of the nose. The head shape was measured at 16 locations around the external contour, angularly positioned at equal intervals of 22.5° (outermost small white circles). Nine points in the upper half of the face captured hairline information. Head shape and hairline were defined by radial frequency (RF) contours, the parameters of which were derived from the positions of these points. The positions and shapes of the internal face features were defined by 14 additional measurements. The position of all features was idiosyncratic, as derived from the photograph. The shapes of the eyes and eyebrows were generic; those of the mouth and nose were individualized. In sum, each synthetic face is defined by 37 parameters and represented by a 37-dimensional vector (see Wilson et al., 2002 for further details). (b) Photograph filtered with a 2.0 octave bandwidth difference of Gaussian filter with peak spatial frequency of 10 c/face width. (c) Corresponding synthetic face. Bottom: Synthetic faces were adjusted by manipulating their distinctiveness, i.e., by how much they differ from the mean face (left). Increasing face distinctiveness results in individual faces becoming progressively more dissimilar (from middle to right) to the mean face. Distinctiveness is expressed as a percentage of mean head radius and quantifies the total geometric variation between the specified face and the mean face. Typical observers can discriminate a complete face from the mean at about 5% distinctiveness (Wilson et al., 2002).
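The distinctiveness manipulation described in this caption can be illustrated with a minimal sketch. It assumes that distinctiveness is the Euclidean length of the difference between a face's 37-dimensional vector and the mean face, with all parameters already expressed as a percentage of mean head radius; this reading, the function names, and the random placeholder vectors are illustrative assumptions and are not taken from the authors' stimulus software.

```python
import numpy as np

def distinctiveness(face, mean_face):
    """Length of the face's offset from the mean face.

    Assumption: parameters are already in units of percent mean head radius,
    so the Euclidean norm can be read directly as percent distinctiveness.
    """
    return float(np.linalg.norm(np.asarray(face) - np.asarray(mean_face)))

def scale_distinctiveness(face, mean_face, target_pct):
    """Move a face along its own identity direction to a target distinctiveness."""
    face = np.asarray(face, dtype=float)
    mean_face = np.asarray(mean_face, dtype=float)
    offset = face - mean_face
    current = np.linalg.norm(offset)
    if current == 0:
        raise ValueError("The mean face has no identity direction to scale.")
    # Keep the direction (identity), change only the distance from the mean.
    return mean_face + offset * (target_pct / current)

# Placeholder data only: a zero mean face and a random 37-D "identity".
rng = np.random.default_rng(0)
mean_face = np.zeros(37)
identity = rng.normal(size=37)

face_10pct = scale_distinctiveness(identity, mean_face, 10.0)
print(round(distinctiveness(face_10pct, mean_face), 6))
```

Under this assumption, a target at 10% distinctiveness and the mean face at 0% lie on the same line through face space, and only the scaling factor changes between trials.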
Figure 3
 
Procedure. Top: A single trial for the full face condition: A target face is shown for 110 ms, followed first by a noise mask, then a blank screen (200 ms each) and finally by two faces side by side. Observers had to select which of the two faces matched the target (2AFC). In this example, a face (righthand side in 2AFC) with a distinctiveness of 10% is the target, which has to be discriminated from the mean face (0% distinctiveness; distractor). Middle: (a) + (b) isolated feature conditions. In (a), an isolated external feature compound (lefthand side in 2AFC) has to be discriminated from a distractor external feature compound. The difference between target and distractor in all examples is 10% distinctiveness. (b) Illustrates an example of the isolated internal feature compound condition (target is lefthand side in 2AFC). Bottom: (c) + (d) Embedded feature conditions: A feature compound is embedded within an otherwise fixed face. (c) Illustrates an example of the embedded external feature compound condition: The difference between the target (left in 2AFC) and distractor lies solely in the external features; the internal features are identical and task-irrelevant. (d) Represents an example of the embedded internal feature condition: The difference between the target (left in 2AFC) and distractor is now restricted to the internal features; the external features are identical and task-irrelevant.
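The trial sequence in this caption can be written down as a small timeline to make the timing explicit. This is only a sketch: the class and function names are hypothetical, and treating the two-alternative choice display as lasting until the observer responds is an assumption, since the caption gives no duration for it.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TrialEvent:
    name: str
    duration_ms: Optional[int]  # None = assumed to last until the response

# Durations for the first three events are those stated in the caption.
FULL_FACE_TRIAL = (
    TrialEvent("target face", 110),
    TrialEvent("noise mask", 200),
    TrialEvent("blank screen", 200),
    TrialEvent("2AFC choice display", None),
)

def onset_times(events):
    """Return (name, onset in ms from trial start) for each event in order."""
    onsets = []
    elapsed = 0
    for event in events:
        onsets.append((event.name, elapsed))
        if event.duration_ms is not None:
            elapsed += event.duration_ms
    return onsets

for name, onset in onset_times(FULL_FACE_TRIAL):
    print(f"{onset:4d} ms  {name}")
```

With the stated durations, the choice display begins 510 ms after target onset (110 + 200 + 200 ms).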
Figure 4
 
Experimental data and model predictions for internal feature compounds. (a) Isolated internal features. Data are given as threshold elevations, relative to the full face condition (= 1). Measured sensitivity for each isolated internal feature (nose, mouth, eyes, and eyebrows) is shown in the light gray bars (Logan et al., 2017). Measured sensitivity for the internal feature compound is given in the dark gray bar. The white bar represents the sensitivity for the internal feature compound predicted by a single channel model, based on efficient integration of information from all individual features. This model predicts a significant improvement in sensitivity for the compound, relative to that for any of the internal features, including the most sensitive one (nose). Measured sensitivity to the isolated internal feature compound, however, was no better than that for the most sensitive feature. The model therefore overestimates sensitivity to the internal feature compound. (b) Embedded internal features. Sensitivities were typically lower when features were embedded in a task-irrelevant fixed face context but the pattern of relative sensitivities was comparable to the isolated condition. Again, the model predicts an improvement in sensitivity for the embedded internal feature compound, relative to that for the most sensitive feature (nose), but this overestimates sensitivity for the internal feature compound. The icons above the graph indicate the features under test, although the increased contrast in (b) is simply used to highlight the variable feature—there was no difference in contrast between features in the experiments. Observers were unaware of the features that were tested on individual trials. The error bars, here and elsewhere, denote 95% confidence intervals (N = 4).
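To see why a model that efficiently pools all feature information must predict a compound threshold below that of the best single feature, consider the following sketch. It uses Minkowski summation of sensitivities (the reciprocals of threshold elevations) as a stand-in; the exact form of the authors' single channel model is not given in this section, so the formula, the default exponent of 2 (quadratic summation), and the function name are assumptions. The input values are hypothetical placeholders, not the measured data.

```python
def predicted_compound_threshold(component_thresholds, exponent=2.0):
    """Predict a compound threshold from component thresholds.

    Sensitivity is taken as 1 / threshold and pooled by Minkowski summation:
        S_compound = (sum_i S_i ** exponent) ** (1 / exponent)
    exponent=1 is linear summation; exponent=2 is quadratic summation.
    Both are illustrative assumptions, not the paper's stated model.
    """
    thresholds = list(component_thresholds)
    if not thresholds:
        raise ValueError("At least one component threshold is required.")
    if any(t <= 0 for t in thresholds):
        raise ValueError("Thresholds must be positive.")
    if exponent <= 0:
        raise ValueError("Exponent must be positive.")
    pooled = sum((1.0 / t) ** exponent for t in thresholds)
    return pooled ** (-1.0 / exponent)

# Hypothetical threshold elevations for four features (placeholders only).
hypothetical = [2.0, 3.0, 4.0, 4.0]

prediction = predicted_compound_threshold(hypothetical)
best_single = min(hypothetical)
print(f"best single feature: {best_single:.2f}")
print(f"pooled prediction:   {prediction:.2f}")
print("prediction beats best feature:", prediction < best_single)
```

Whenever more than one feature carries information, any finite exponent yields a predicted threshold below the best component. A measured compound threshold that merely equals the best component therefore falls short of this kind of prediction, which is the pattern the caption describes.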
Figure 5
 
Experimental data and model predictions for external feature conditions. (a) Isolated external features. Measured sensitivity for each isolated external feature (head shape and hairline) is shown in the light gray bars. Measured sensitivity for the isolated external feature compound is given in the dark gray bar. The white bar represents the sensitivity predicted by a model based on efficient integration of available information. Sensitivity to the isolated external feature compound is comparable to that for, but no better than, the most sensitive isolated feature (head shape). (b) Embedded external features. The same pattern was found for the external features embedded within a fixed face context. Specifically, sensitivity to the most sensitive embedded individual feature (head shape) is comparable to that for the embedded external feature compound. Across both isolated and embedded conditions, sensitivity to the best component (head shape) is similar to that for full faces. In both cases, the model predicts sensitivity that is better than that of the best component and thus overestimates performance in the compound conditions.
Figure 6
 
Sensitivity to feature compounds. Threshold elevations for internal (eyes, nose, mouth, brows) and external (head shape and hairline) feature compounds are presented relative to the full face condition (threshold elevation = 1.00; white bar), either in isolation (light bars) or embedded within an otherwise fixed face (dark bars). Thresholds were significantly elevated from the full face baseline for internal (isolated p = 0.024; embedded p = 0.006), but not external (p = 0.99) compounds. Thresholds did not differ between respective isolated and embedded conditions (p = 0.99) (denoted by “n.s.”). Asterisks indicate significant elevation from the full face baseline (pairwise comparisons with Bonferroni correction; p < 0.005).