Abstract
Human object recognition is robust to a variety of object transformations, including changes in lighting, rotations, and translations, as well as to other image manipulations such as the addition of various forms of noise. Invariance has been shown to emerge gradually along the ventral visual stream, with later regions showing higher tolerance to object transformations. In contrast, despite their unprecedented performance on numerous visual tasks, Deep Neural Networks (DNNs) fall short of human-level robustness to image perturbations (adversarial attacks), even perturbations that are imperceptible to humans. One potential explanation for this difference is that brains, but not DNNs, build increasingly disentangled and therefore robust object representations at each successive stage of the ventral visual stream. Here, we asked whether training DNNs to emulate human neural representations can enhance their robustness and, more importantly, whether representations from successive stages of the ventral visual stream confer progressively greater robustness, reflecting the gradual emergence of the invariance that underlies human perception. We extracted neural activity patterns from five hierarchical regions of interest (ROIs) in the ventral visual stream (V1, V2, V4, LO, and TO) in a 7T fMRI dataset (Allen et al., 2022) acquired while human participants viewed natural images. DNN models were trained to perform image classification while aligning their penultimate-layer representations with neural activity from each ROI. Our findings reveal not only a significant improvement in DNN robustness but also a hierarchical effect: greater robustness gains were observed when models were trained with neural representations from later stages of the visual hierarchy. These results show that ventral visual cortex representations improve DNN robustness and support the gradual emergence of robustness along the ventral visual stream.
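The training objective described above (image classification combined with alignment of penultimate-layer representations to ROI activity) can be sketched as a joint loss. The abstract does not specify the alignment metric, so this minimal NumPy sketch assumes linear Centered Kernel Alignment (CKA), a common choice for comparing representation matrices; the `joint_loss` function, its weighting parameter `lam`, and all variable names are hypothetical, not the authors' implementation.

```python
import numpy as np

def cross_entropy(logits, labels):
    # Softmax cross-entropy averaged over a batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def linear_cka(X, Y):
    # Linear CKA between two (stimuli x features) representation matrices;
    # returns a similarity in [0, 1], with 1 for identical subspaces.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def joint_loss(logits, labels, penultimate, roi_activity, lam=1.0):
    # Classification loss plus a penalty for misalignment between the
    # network's penultimate-layer activations and the ROI's fMRI patterns
    # for the same stimuli (lam is an assumed trade-off hyperparameter).
    alignment = linear_cka(penultimate, roi_activity)
    return cross_entropy(logits, labels) + lam * (1.0 - alignment)
```

In practice such a loss would be minimized by gradient descent in a deep-learning framework; the sketch only illustrates how the two terms combine, with the alignment term driving the penultimate layer toward the neural representation of a given ROI.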