Responses for the changes in each configural position within the sad group and within the angry group are shown in Figures 3A and 3B, respectively, with significant differences (p < 0.01) denoted by an asterisk. The combined responses for the sad group and for the angry group are shown in Figure 3C. The abscissa gives the signed difference in change level between the first and second image in each presentation. For example, if the first image has a configural change of 75% and the second image is at 25%, the subject's response falls into the −50% grouping (negative change). Likewise, if the first image is at 50% and the second image is at 75%, the response falls into the 25% grouping (positive change). The ordinate gives the percentage of “less,” “same,” and “more” responses made at each grouping.
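The grouping used on the abscissa can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described above, not the authors' analysis code: the function names, the trial tuple layout, and the sample trials are all hypothetical.

```python
from collections import Counter, defaultdict

RESPONSES = ("less", "same", "more")

def change_group(first_level, second_level):
    """Signed difference in displacement level (second image minus first), in percent."""
    return second_level - first_level

def response_percentages(trials):
    """Percentage of each response label within each signed-change grouping.

    `trials` is an iterable of (first_level, second_level, response) tuples;
    this layout is an assumption made for the example.
    """
    counts = defaultdict(Counter)
    for first, second, response in trials:
        counts[change_group(first, second)][response] += 1
    percentages = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        percentages[group] = {r: 100.0 * counter[r] / total for r in RESPONSES}
    return percentages

# Made-up trials, used only to exercise the functions.
trials = [(75, 25, "less"), (50, 75, "more"), (50, 75, "same"), (0, 25, "more")]
percentages = response_percentages(trials)
```

With these made-up trials, the 75%-to-25% presentation lands in the −50% grouping and the 50%-to-75% and 0%-to-25% presentations share the 25% grouping, matching the two worked examples in the text.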
The patterns observed in the “more” responses in Figure 3 are sigmoidal, as expected, since they approach zero below 0% and saturate beyond 100%. This response type is common in many perceptual studies. Nonetheless, we note that the percentage of “more” responses increases linearly with the amount of positive change (i.e., from 0% to 100%). This is because the sigmoidal response approximates a linear function with very low slope in this interval. This means that, in the sad group, if the distance between the baseline of the eyes and the mouth increases by $x$, the percentage of “more” responses increases by $f_s(x)$, where $f_s(\cdot)$ is a linear function ($r^2 = 0.981$). For the angry group, when the distance between the eyes and the mouth decreases by $x$, the percentage of “more” responses increases by $f_a(x)$, with $f_a(\cdot)$ also linear ($r^2 = 0.986$). Similarly, the percentage of “less” responses increases linearly with the amount of negative change. In this case, the $r^2$ value is 0.995 for the sad group and 0.998 for the angry group. The percentage of “same” responses (i.e., identical perception of sadness or anger in the first and second image) also decreases linearly. For the sad group, the percentage of “same” responses for negative change has an $r^2$ value of 0.988, while the percentage of “same” responses for positive change has an $r^2$ value of 0.966. For the angry group, the corresponding values are $r^2 = 0.991$ and $r^2 = 0.971$, respectively.
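To illustrate why a sigmoidal response can still yield a high $r^2$ under a linear fit over a limited interval, the following sketch fits a least-squares line to a logistic curve sampled at five change levels. The logistic midpoint and scale are arbitrary assumptions chosen for illustration; they are not estimated from the data in this study, and the resulting $r^2$ is not one of the reported values.

```python
import math

def logistic(x, midpoint=50.0, scale=40.0):
    """Hypothetical sigmoidal response curve, in percent (parameters are assumptions)."""
    return 100.0 / (1.0 + math.exp(-(x - midpoint) / scale))

def linear_fit_r2(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns slope, intercept, r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2

levels = [0, 25, 50, 75, 100]  # change levels, in percent
responses = [logistic(x) for x in levels]
slope, intercept, r2 = linear_fit_r2(levels, responses)
```

For a curve this shallow over the sampled range, the fitted line accounts for nearly all of the variance, which is the sense in which the sigmoid is approximately linear in this interval.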
These results suggest an underlying dimension in a multidimensional face space where sadness and anger are represented as variations from a norm (prototypical or mean) face. This norm-based face space is well documented for the task of identification (Leopold, Bondar, & Giese, 2006; Rhodes & Jeffery, 2006; Valentine & Bruce, 1986), but has not been previously shown for the representation and recognition of expressions. This is a surprising finding, because the perception of facial expressions of emotion is purported to be categorical (Beale & Keil, 1995; Calder, Young, Perrett, Etcoff, & Rowland, 1996; Young et al., 1997). Nonetheless, our results suggest that although the perception of emotion may be categorical, the underlying representation is (at least in part) continuous, in the sense that the perception of an emotion is made clearer as we move away from the norm face (center of the psychological space) and a perception of neutral expression is obtained at the “center” of this face space (Russell, 1980).
A $\chi^2$ goodness-of-fit test was applied to determine whether the responses at different levels of displacement are indeed statistically different from those at 0% change. To properly perform this test, the cumulative responses for the sad group and for the angry group were examined (Figure 3C). For each of the sad and angry groups, the responses for the 0% change in displacement were used as the expected values. Therefore, the null hypothesis ($H_0$) states that there is no difference between the response profile for any given level of change in facial feature displacement and the response profile for no change in displacement. All comparisons for both the sad group and the angry group yielded a significant $\chi^2$ value at $p = 0.01$ with two degrees of freedom ($\chi^2_{cv} = 9.21$; $df = 2$). For both the angry group and the sad group, the residuals indicate that the “less” responses are the major contributors to the significant differences for the negative changes. Similarly, the “more” responses are the major contributors to the significant differences for the positive changes. These results are again consistent with a norm-based representation.
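The test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the response counts are invented, the expected profile is rescaled to the observed total (an assumption about how the 0% responses were used), and the helper names are hypothetical. For two degrees of freedom the $\chi^2$ survival function is $e^{-x/2}$, so the critical value can be obtained in closed form without a statistics library.

```python
import math

def chi_square_gof(observed, expected_profile):
    """Pearson chi-square of observed counts against an expected response profile.

    The expected profile (here, the responses at 0% change) is rescaled to the
    observed total. Returns the statistic and the Pearson residuals.
    Expected counts must be nonzero.
    """
    total_observed = sum(observed)
    total_expected = sum(expected_profile)
    expected = [total_observed * e / total_expected for e in expected_profile]
    residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
    statistic = sum(r * r for r in residuals)
    return statistic, residuals

def critical_value_df2(alpha):
    """Chi-square critical value for df = 2, where P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)

# Invented counts of (less, same, more) responses, for illustration only.
baseline_counts = [20, 60, 20]   # hypothetical profile at 0% change
positive_counts = [5, 35, 60]    # hypothetical profile at a positive change

statistic, residuals = chi_square_gof(positive_counts, baseline_counts)
critical = critical_value_df2(0.01)
significant = statistic > critical
largest = max(range(3), key=lambda i: abs(residuals[i]))  # index of the main contributor
```

In this invented example the largest residual belongs to the third category (“more”), mirroring the kind of residual inspection described in the text for positive changes.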
It is well known that in many perceptual and psychophysical studies, the responses are stronger at the extremes (Attneave, 1950), which is generally where the density of the underlying cognitive multidimensional space is lower (Krumhansl, 1978). This effect has been observed in the recognition of identity in face images (Benson & Perrett, 1994; Valentine & Ferrara, 1991). In these studies, faces closer to the mean face (i.e., the center of the face space) are more difficult to identify, while faces far from the mean are recognized more easily. Therefore, we can predict that if the representation of expression is indeed norm-based, then a similar pattern should be observed in our results. To test this hypothesis, we analyze the responses to each of the configural changes from 25% to 75%. Note that each configural change includes several possible pairs. For example, the 25% change includes eight possible scenarios, because the variation between the image with 0% displacement and that with 25% is the same as those from 25% to 50%, 50% to 75%, and 75% to 100%, and the same holds for the four mirror pairs. We show the responses to a difference of 25% change for the sad and angry groups in Figure 4A. In this plot, the abscissa values (n%–m%) give the percentage of facial feature displacement in the first image (n%) and in the second image (m%).
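The count of eight scenarios for the 25% change can be checked by enumerating the ordered pairs of displacement levels. The sketch below assumes the five levels named in the text (0%, 25%, 50%, 75%, and 100%); the function name and the label format are illustrative choices.

```python
LEVELS = (0, 25, 50, 75, 100)  # displacement levels in percent, as named in the text

def pairs_for_change(magnitude, levels=LEVELS):
    """Ordered (n, m) pairs of first and second image levels whose
    displacement differs by exactly `magnitude` percent."""
    return [(n, m) for n in levels for m in levels if abs(m - n) == magnitude]

pairs_25 = pairs_for_change(25)
positive_25 = [(n, m) for n, m in pairs_25 if n < m]   # e.g., 0% to 25%
negative_25 = [(n, m) for n, m in pairs_25 if n > m]   # the mirror pairs
labels_25 = [f"{n}%–{m}%" for n, m in pairs_25]        # abscissa-style labels
```

Here `pairs_25` contains four positive pairs and their four mirror pairs, for eight in total.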
Our results are again consistent with a norm-based model. For positive changes (i.e., when n < m), we see a linear increment in the percentage of responses. That is, the more we move away from the center of the face space, the more apparent the percept becomes. When the same difference of change is applied to the face stimuli farthest from the mean face, the perception of sadness and anger is maximized. We also note, however, that the plot is asymmetric, because the same does not apply to the negative changes (i.e., n > m). We see this specifically in the “same” responses. While on the positive side of the plot the percentage of “same” responses decreases linearly, on the negative side the percentages are practically identical, i.e., the responses have flattened. A quantitative comparison of the (negative) “same” responses yields differences of less than 5% in all conditions. This means the direction of change is also important, since the perception of anger/sadness is most visible when the change is positive (n < m). The same is true for the plots at 50% and 75% change given in Figures 4B and 4C. This suggests the face space is warped, a phenomenon known in psychology (Tversky, 1977) and in some face recognition tasks (Rotshtein, Henson, Treves, Driver, & Dolan, 2005), but not previously reported in the perception of facial expressions.
Asymmetries like the one observed in these results arise when the more salient of the two objects shown to the subject is placed in a different location in the psychological space and, hence, plays a distinct role. We note that when we go from a large transformation (e.g., the extremes at 100%) to a smaller one (e.g., 25%), the density of points in the face space increases. This is because the second image is closer to the mean face, and the density of points representing faces increases as we approach the mean face. As we get closer to the dense areas, it becomes harder to distinguish between percepts, and the perception of sadness and anger diminishes. The opposite is true when we go from a denser to a less dense region.
We also performed an analysis to assess any learning that may have taken place over the course of an experimental session and that could be responsible for the increased perception of sadness and anger. To assess the influence of learning on the subjects' responses, the percentages of correct responses were examined across the sequence of trials. Responses that reflected the actual change in deformation were considered correct, i.e., “less” responses were considered correct when the changes were negative, and “more” responses were considered correct when the changes were positive. The data revealed a moderate improvement of around 5% from the first trial to the final one. To see whether this improvement could have affected the results reported above, we divided the data into two halves and checked whether the improvement was more prominent in any one of the conditions. We observed that all conditions had an almost identical increase, and thus the pattern of the data remained unaltered: a plot of the first half of the data and a plot of the second half have the same (linear) pattern, with the first-half plot about 5% below the second-half plot. The reported data sit between these two plots and show the same pattern. Hence, although there was a small learning effect, it did not change the pattern of responses described in this section.
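The scoring rule and the split-half comparison can be sketched as below. The text defines correctness only for negative and positive changes; treating “same” as the correct response for a 0% change is an assumption made here, as are the function names, the trial layout, and the sample trials.

```python
def is_correct(first_level, second_level, response):
    """Score a response against the actual signed change in deformation."""
    change = second_level - first_level
    if change < 0:
        return response == "less"
    if change > 0:
        return response == "more"
    return response == "same"  # assumption: not specified in the text

def percent_correct(trials):
    """Percentage of correct responses in a non-empty list of trials."""
    if not trials:
        raise ValueError("no trials to score")
    correct = sum(is_correct(first, second, r) for first, second, r in trials)
    return 100.0 * correct / len(trials)

def split_half_accuracy(trials):
    """Percent correct in the first and second halves of a session, in trial order."""
    half = len(trials) // 2
    return percent_correct(trials[:half]), percent_correct(trials[half:])

# Made-up session in presentation order, for illustration only.
session = [(0, 25, "same"), (75, 25, "less"), (25, 50, "more"), (100, 50, "less")]
first_half, second_half = split_half_accuracy(session)
improvement = second_half - first_half
```

Applying the same split within each condition separately would show whether the improvement is concentrated in any one condition, which is the comparison described in the text.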