Abstract
Inverted encoding models (IEMs) have recently become popular as a method for decoding stimuli, notably for reconstructing perceptual and mnemonic visual features (e.g., Brouwer & Heeger, 2009; Sprague, Ester, & Serences, 2016). We demonstrate that current evaluations of IEMs can produce spurious conclusions because they fail to account for the underlying channel response profiles (basis set) of the encoder. Our novel modification to IEMs solves this problem and further yields improved decoding interpretability and the capacity for trial-by-trial goodness-of-fit estimates. The advantages of IEMs are that the model is based on population-level tuning functions (i.e., channel response profiles), which are thought to reflect underlying neuronal tuning better than similar decoding models do, and that decoding with IEMs occurs on a continuous scale, such that stimuli not used to train the model may be predicted. We argue that IEMs remain a powerful method for detecting stimulus-specific information; however, the means by which IEMs are currently evaluated is problematic. Currently, IEMs are evaluated via reconstructed channel response profiles ("reconstructions") that are aligned and averaged across trials. Using simulations and fMRI data from studies of visual perception and attention, we show that the standard measures for evaluating reconstructions (e.g., slope, amplitude, bandwidth) do not take into account the assumed channel response profiles of the encoder. This matters because a significantly steeper slope does not necessarily imply more stimulus-specific information in the brain region: a reconstruction that "looks" better (e.g., higher amplitude, lower standard deviation) can sometimes reflect less stimulus-specific information than a relatively worse-looking reconstruction, even when both reconstructions come from the same basis set.
Our method solves these problems and additionally provides a means for improved decoding interpretability (reconstructions in stimulus space rather than channel space) and trial-by-trial goodness-of-fit estimates (useful for excluding noisy trials to increase statistical power).
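For readers unfamiliar with the pipeline being critiqued, the following is a minimal NumPy sketch of a generic IEM in the style of Brouwer and Heeger (2009): train channel-to-voxel weights by least squares, invert them to reconstruct channel responses on held-out trials, and project those reconstructions into stimulus space. It is not the authors' implementation; all dimensions, noise levels, and the cosine basis parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (hypothetical, not taken from the paper)
n_chan, n_vox, n_train, n_test = 8, 50, 200, 40
centers = np.linspace(0, 180, n_chan, endpoint=False)  # channel centers (deg)

def channel_responses(stim_deg):
    """Assumed basis set: half-wave-rectified cosines raised to a power,
    on a 180-degree circular orientation space."""
    d = (np.asarray(stim_deg)[:, None] - centers[None, :] + 90) % 180 - 90
    return np.clip(np.cos(np.pi * d / 180), 0, None) ** 6

# Simulate voxel data as a noisy linear mixture of channel responses
stim_train = rng.uniform(0, 180, n_train)
stim_test = rng.uniform(0, 180, n_test)
W_true = rng.uniform(0, 1, (n_chan, n_vox))  # channel-to-voxel weights
B_train = channel_responses(stim_train) @ W_true + rng.normal(0, 0.1, (n_train, n_vox))
B_test = channel_responses(stim_test) @ W_true + rng.normal(0, 0.1, (n_test, n_vox))

# Step 1 (encoding/training): solve B = C @ W for the weights
C_train = channel_responses(stim_train)
W_hat = np.linalg.lstsq(C_train, B_train, rcond=None)[0]

# Step 2 (inversion): reconstruct channel responses on held-out trials
C_hat = np.linalg.lstsq(W_hat.T, B_test.T, rcond=None)[0].T  # trials x channels

# Project reconstructions into stimulus space and decode the peak;
# this continuous readout is what allows untrained stimuli to be predicted
space = np.arange(180)
recon_stim = C_hat @ channel_responses(space).T  # trials x 180
decoded = space[np.argmax(recon_stim, axis=1)]
err = np.abs((decoded - stim_test + 90) % 180 - 90)  # circular error (deg)
```

Note that the stimulus-space projection in the last step depends entirely on the assumed basis set, which is the crux of the abstract's argument: summary statistics computed on the channel-space reconstructions (slope, amplitude, bandwidth) inherit that assumption and cannot be compared at face value.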