Abstract
Recent research shows that deep neural networks (DNNs) trained for object recognition can predict neural responses to natural stimuli with unprecedented accuracy, serving as computational models of hierarchical visual processing along the ventral visual stream. Several functional brain datasets, compiling neural responses to large-scale natural image sets, have been published to facilitate this DNN modeling approach. However, their application to spatial cognitive processing, especially within the dorsal visual stream, remains underexplored. Here, we propose a novel dataset that combines fMRI with wide-view stereoscopic presentation of natural 3D scenes, reflecting conditions known to facilitate spatial cognitive functions. The stimuli consisted of movie clips of indoor 3D scenes with 3D observer motion, generated using Habitat-Sim, a simulator of real-world environments for training embodied AI agents. To preserve the geometrical accuracy of the 3D spatial structure, the viewing angle and participant-wise interpupillary distance were matched between rendering and presentation. Training and test data were acquired in separate scanning runs, each presenting the scene movie clips continuously. Preliminary results show voxels with high explainable variance across both ventral and dorsal visual cortical areas, extending to the far periphery, indicating the dataset's potential for quantitative, high-dimensional modeling of the visuo-spatial processing involved in human spatial cognition.
Funding: This research was supported by JSPS KAKENHI Grant 21H04896.