Abstract
Recognizing faces regardless of viewpoint is critical for social interactions. Single-neuron electrophysiological recordings in macaques suggest a three-step architecture: a strictly view-tuned representation in the middle-lateral/middle-fundus (ML/MF) face patches gives way sharply to a mirror-symmetric representation in the anterior-lateral (AL) face patch, before viewpoint invariance is achieved in the anterior-medial (AM) face patch, at the highest level of the hierarchy. However, human studies combining functional magnetic resonance imaging (fMRI) and Representational Similarity Analysis (RSA) have led to divergent conclusions in all core face-selective areas, including the Fusiform Face Area (FFA). This makes it hard to relate observations within and across species. We previously proposed a geometric configuration in multivariate space that accounts for divergent observations in human FFA. Here, by considering the impact on RSA of signal imbalances across conditions and of measurement scale, we show that this geometric configuration is compatible with observations in macaque area ML/MF, but not AL. Our account shows that key assumptions of RSA sometimes break down. Specifically, we show that inferences about neuronal coding with RSA are influenced by translation and rotation of the data. We also show that abstracting from the measurement process and relying directly on the rank order of entries of dissimilarity matrices to relate representations across species and techniques leads to errors when marked signal imbalances are observed across conditions. Using biologically motivated network models, forward models, and previously published empirical fMRI data and single-cell monkey electrophysiological recordings, we demonstrate that details of the measurement process must be considered to validly relate measurements across species and techniques.
These findings reveal limitations of RSA, call for cross-species comparisons that take the measurement process into account, and support the idea that human FFA is view-tuned like macaque area ML/MF, rather than mirror-symmetrically tuned like area AL.
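The sensitivity of RSA to translation and rotation of the data can be illustrated with a minimal numerical sketch. It is not the analysis reported here: the toy patterns, the shift vector, the rotation angle, and the helper names `correlation_rdm` and `euclidean_rdm` are all hypothetical, and the sketch assumes correlation distance as the dissimilarity measure, a common choice in RSA. Under that assumption, adding the same non-uniform pattern to every condition, or rotating the response space, changes the dissimilarity matrix, whereas a Euclidean dissimilarity matrix is unaffected.

```python
import numpy as np


def correlation_rdm(patterns):
    """Correlation-distance RDM (1 - Pearson r) between condition patterns (rows)."""
    return 1.0 - np.corrcoef(patterns)


def euclidean_rdm(patterns):
    """Euclidean-distance RDM between condition patterns (rows)."""
    diffs = patterns[:, None, :] - patterns[None, :, :]
    return np.linalg.norm(diffs, axis=-1)


# Hypothetical toy data: 3 conditions x 3 measurement channels.
patterns = np.eye(3)

# Translation: add the same non-uniform pattern to every condition
# (an assumed stand-in for a response component shared across conditions).
shift = np.array([10.0, 0.0, 0.0])
translated = patterns + shift

# Rotation: rotate the response space by 45 degrees about the third axis.
theta = np.pi / 4
c, s = np.cos(theta), np.sin(theta)
rotation = np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])
rotated = patterns @ rotation.T

# Correlation distance is altered by both transformations ...
translation_changes_corr = not np.allclose(
    correlation_rdm(patterns), correlation_rdm(translated))
rotation_changes_corr = not np.allclose(
    correlation_rdm(patterns), correlation_rdm(rotated))

# ... whereas Euclidean distance is invariant to both.
translation_keeps_euclid = np.allclose(
    euclidean_rdm(patterns), euclidean_rdm(translated))
rotation_keeps_euclid = np.allclose(
    euclidean_rdm(patterns), euclidean_rdm(rotated))

print("correlation RDM changed by translation:", translation_changes_corr)
print("correlation RDM changed by rotation:", rotation_changes_corr)
print("Euclidean RDM unchanged by translation:", translation_keeps_euclid)
print("Euclidean RDM unchanged by rotation:", rotation_keeps_euclid)
```

In this toy case the correlation distance depends on where the patterns sit relative to the origin and to the channel axes, which is why the choice of dissimilarity measure interacts with properties of the measurement process.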