September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract | September 2024
Spurious reconstruction from brain activity: The thin line between reconstruction, classification, and hallucination
Author Affiliations & Notes
  • Ken Shirakawa
    Graduate School of Informatics, Kyoto University
    ATR Computational Neuroscience Laboratories
  • Yoshihiro Nagano
    Graduate School of Informatics, Kyoto University
    ATR Computational Neuroscience Laboratories
  • Misato Tanaka
    Graduate School of Informatics, Kyoto University
    ATR Computational Neuroscience Laboratories
  • Shuntaro C. Aoki
    Graduate School of Informatics, Kyoto University
    ATR Computational Neuroscience Laboratories
  • Kei Majima
    National Institutes for Quantum Science and Technology
  • Yusuke Muraki
    Graduate School of Informatics, Kyoto University
  • Yukiyasu Kamitani
    Graduate School of Informatics, Kyoto University
    ATR Computational Neuroscience Laboratories
  • Footnotes
    Acknowledgements  This work was supported by the JSPS (KAKENHI grants JP20H05954, JP20H05705, JP21K17821 and 22KJ1801), JST (CREST grants JPMJCR18A5, and JPMJCR22P3), and NEDO (commissioned project, JPNP20006)
Journal of Vision September 2024, Vol.24, 321. doi:https://doi.org/10.1167/jov.24.10.321
      Ken Shirakawa, Yoshihiro Nagano, Misato Tanaka, Shuntaro C. Aoki, Kei Majima, Yusuke Muraki, Yukiyasu Kamitani; Spurious reconstruction from brain activity: The thin line between reconstruction, classification, and hallucination. Journal of Vision 2024;24(10):321. https://doi.org/10.1167/jov.24.10.321.



      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Visual image reconstruction aims to recover arbitrary stimulus/perceived images from brain activity. To achieve reconstruction over diverse images, especially with limited training data, it is crucial that the model leverage a compositional representation that spans the image space, with each feature effectively mapped to brain activity. In light of these considerations, we critically assessed recently reported photorealistic reconstructions based on text-to-image diffusion models applied to a large-scale fMRI/stimulus dataset (Natural Scene Dataset, NSD). We found a notable decrease in the reconstruction performance of these models on a different dataset (Deeprecon) specifically designed to prevent category overlaps between the training and test sets. UMAP visualization of the target features (CLIP text/semantic features) of the NSD images revealed strikingly limited diversity, with only ~40 distinct semantic clusters shared between the training and test sets. Further, CLIP feature decoders trained on NSD showed marked difficulty in predicting novel semantic clusters not present in the training set. Simulations likewise showed that decoders failed to predict new clusters when the training set was restricted to a small number of clusters: clustered training samples appear to restrict the feature dimensions that can be predicted from brain activity. Conversely, when the training set was diversified to ensure a broader distribution over the feature dimensions, the decoders generalized better beyond the trained clusters. Nonetheless, it is important to note that text/semantic features alone are insufficient for a complete mapping to the visual space, even if they are perfectly predicted from brain activity.
Building on these observations, we argue that the recent photorealistic reconstructions may predominantly be a blend of classification into trained semantic categories and the generation of convincing yet inauthentic images (hallucinations) through text-to-image diffusion. To avoid such spurious reconstructions, we offer guidelines for developing generalizable methods and conducting reliable evaluations.
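The cluster-restriction effect described in the abstract can be sketched with synthetic data. This is a minimal illustration, not the study's actual simulation: the dimensionalities, cluster counts, noise levels, and the linear feature-to-activity generative model below are all assumptions chosen for brevity.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
d_feat, d_brain, n_per = 50, 200, 100

# Assumed generative model for illustration only: feature vectors map
# linearly to "brain activity" with additive Gaussian noise.
W = rng.normal(size=(d_feat, d_brain))

def sample_cluster(center, n):
    """Draw n feature samples around a cluster center, plus brain responses."""
    feats = center + 0.1 * rng.normal(size=(n, d_feat))
    brain = feats @ W + 0.5 * rng.normal(size=(n, d_brain))
    return feats, brain

centers = rng.normal(size=(20, d_feat))   # hypothetical semantic clusters
novel_center = rng.normal(size=d_feat)    # a cluster absent from training

def decode_novel(train_centers):
    """Train a ridge decoder (brain -> features); evaluate on the novel cluster."""
    feats, brains = zip(*(sample_cluster(c, n_per) for c in train_centers))
    dec = Ridge(alpha=1.0).fit(np.vstack(brains), np.vstack(feats))
    f_true, b_test = sample_cluster(novel_center, n_per)
    f_pred = dec.predict(b_test)
    # Mean per-sample correlation between predicted and true feature vectors.
    return float(np.mean([np.corrcoef(p, t)[0, 1]
                          for p, t in zip(f_pred, f_true)]))

r_clustered = decode_novel(centers[:2])   # training confined to 2 clusters
r_diverse = decode_novel(centers)         # training spans all 20 clusters
print(f"novel-cluster decoding: clustered r={r_clustered:.2f}, "
      f"diverse r={r_diverse:.2f}")
```

Under these assumptions, training samples confined to a few clusters leave most feature directions poorly constrained, so the decoder generalizes worse to an unseen cluster than one trained on a diverse set, mirroring the pattern the abstract reports.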
