Journal of Vision
September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2024
Visual Inputs Reconstructing through Enhanced 3T fMRI Data from Optimal Transport Guided Generative Adversarial Network
Author Affiliations & Notes
  • Yujian Xiong
    Arizona State University
  • Wenhui Zhu
    Arizona State University
  • Yalin Wang
    Arizona State University
  • Zhong-Lin Lu
    NYU Shanghai
    New York University
  • Footnotes
    Acknowledgements  The work was partially supported by NSF (DMS-1413417 & DMS-1412722) and NIH (R01EY032125 & R01DE030286).
Journal of Vision September 2024, Vol.24, 1478. doi:https://doi.org/10.1167/jov.24.10.1478
Abstract

Unraveling the intricacies of the human visual system by reconstructing visual inputs from functional Magnetic Resonance Imaging (fMRI) has seen significant strides with deep learning. However, the persistent demand for high-quality, subject-specific 7-Tesla (7T) fMRI experiments poses challenges: integrating smaller 3-Tesla (3T) datasets or accommodating subjects with short, low-quality scans remains a hurdle. Here we propose a novel framework that employs an Optimal Transport Guided Generative Adversarial Network (GAN) to enhance 3T fMRI, addressing both the scarcity of 7T data and the limitations of short, low-quality 3T scans, which place less burden on subjects. Our model, the OT Guided GAN, is built around a six-layered U-Net that enhances 3T fMRI scans to a quality comparable to original 7T scans. Training is conducted across 17 subjects drawn from two datasets with distinct experimental conditions: the 7T Natural Scenes Dataset and the 3T Natural Object Dataset. The two datasets share a common set of stimulus images viewed by both the 3T and 7T subjects, enabling an unsupervised training scenario. Two linear regression models then map the combined original 7T and enhanced 3T fMRI into inputs for a pre-trained Stable Diffusion model, which reconstructs the visual input images. We test the framework's ability to reconstruct natural-scene stimuli from a 3T subject excluded from training. Evaluated with the Fréchet Inception Distance (FID) score and human judgment, the enhanced 3T fMRI produces better reconstructions than recent methods that require extensive 7T data. Once adequately trained, the framework can enhance data from any new subject scanned only at 3T, beyond the training set, and the improved data can then be used for demanding reconstruction tasks.
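To make the enhancement stage concrete, the sketch below shows one plausible reading of it: a U-Net generator mapping 3T volumes toward 7T quality, trained with an adversarial loss plus an entropic (Sinkhorn-style) optimal-transport penalty between generated and unpaired real 7T responses. This is an illustrative assumption of how OT guidance could be wired in, not the authors' implementation; all module names, network sizes, and loss weights (UNet3D, sinkhorn_ot, the 0.1 OT weight) are hypothetical.

# Illustrative sketch only (assumed architecture and losses, not the authors' code):
# a U-Net generator that enhances 3T fMRI volumes, trained with an adversarial
# loss plus an entropic optimal-transport (Sinkhorn) penalty against unpaired 7T data.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    # Two 3x3x3 convolutions with ReLU: the standard U-Net building block.
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class UNet3D(nn.Module):
    # Encoder-decoder with skip connections; the abstract's "six-layered U-Net"
    # is interpreted here as six resolution levels (an assumption).
    def __init__(self, depth=6, base=8):
        super().__init__()
        chans = [base * 2 ** i for i in range(depth)]
        self.down = nn.ModuleList()
        c_prev = 1
        for c in chans:
            self.down.append(conv_block(c_prev, c))
            c_prev = c
        self.up, self.up_conv = nn.ModuleList(), nn.ModuleList()
        for c in reversed(chans[:-1]):
            self.up.append(nn.ConvTranspose3d(c_prev, c, 2, stride=2))
            self.up_conv.append(conv_block(2 * c, c))
            c_prev = c
        self.head = nn.Conv3d(c_prev, 1, 1)

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.down):
            x = block(x)
            if i < len(self.down) - 1:
                skips.append(x)
                x = F.max_pool3d(x, 2)
        for up, conv, skip in zip(self.up, self.up_conv, reversed(skips)):
            x = conv(torch.cat([up(x), skip], dim=1))
        return self.head(x)

def sinkhorn_ot(a, b, eps=0.05, iters=50):
    # Entropic OT cost between two batches of flattened responses; a stand-in
    # for the (unspecified) optimal-transport guidance term.
    cost = torch.cdist(a, b) ** 2
    cost = cost / (cost.max() + 1e-9)                 # scale-normalize for stability
    K = torch.exp(-cost / eps)
    u = torch.full((a.size(0),), 1.0 / a.size(0), device=a.device)
    v = torch.full((b.size(0),), 1.0 / b.size(0), device=b.device)
    for _ in range(iters):                            # Sinkhorn fixed-point updates
        u = (1.0 / a.size(0)) / (K @ v + 1e-9)
        v = (1.0 / b.size(0)) / (K.t() @ u + 1e-9)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)        # transport plan
    return (plan * cost).sum()

if __name__ == "__main__":
    gen = UNet3D()
    disc = nn.Sequential(                             # toy discriminator
        nn.Conv3d(1, 8, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Flatten(), nn.LazyLinear(1),
    )
    x3t = torch.randn(2, 1, 32, 32, 32)               # stand-in 3T volumes
    x7t = torch.randn(2, 1, 32, 32, 32)               # unpaired 7T volumes (shared stimuli)
    fake = gen(x3t)
    adv = F.binary_cross_entropy_with_logits(disc(fake), torch.ones(2, 1))
    ot = sinkhorn_ot(fake.flatten(1), x7t.flatten(1))
    loss = adv + 0.1 * ot                             # OT weight is an assumed hyperparameter
    loss.backward()

In this reading, the OT term pushes the distribution of enhanced 3T responses toward the unpaired 7T responses elicited by the shared stimulus set, which is what makes the unsupervised training scenario described in the abstract possible; the downstream regression into Stable Diffusion inputs is unchanged.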
