Abstract
Global amplitude spectra from the discrete Fourier transform (DFT) have proven useful in studying behavioral and computational aspects of visual object and scene recognition. Here, we investigated whether Fourier phase (rather than amplitude) spectra may be useful for another purpose, namely guiding attentional selection. We developed a simple model that produces salience maps from phase information alone by (1) downsampling images by one or more factors, (2) computing the DFT of each downsampled image's luminance, (3) normalizing each complex DFT value to unit amplitude while retaining its phase, (4) computing the inverse DFT, (5) squaring the result, and (6) combining the maps resulting from each downsampling factor. Salience maps from this model significantly predicted the free-viewing gaze patterns of four observers for 337 images of natural outdoor scenes, fractals, and aerial imagery. For fractals and aerial imagery, this phase-based model was significantly better (paired t-test, p […]) […] 1/f power spectra, so forcing a flat Fourier amplitude spectrum is similar to scaling the amplitude everywhere by f, which is equivalent to a spatial derivative. However, this derivative-like aspect cannot completely explain our results, because the image category with the most 1/f-like spectrum (outdoor scenes) was the one for which the phase-only model fared worst. Just as Fourier amplitude can form a computational basis for scene categorization (Torralba 2003), our results establish Fourier phase information as one possible computational basis for spatial attentional selection.
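The six steps listed in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how images are downsampled or how the per-factor maps are combined, so block averaging, nearest-neighbor upsampling, per-map peak normalization, and simple averaging are all assumptions here, as are the function names, the default factors, and the small `eps` guard against division by zero.

```python
import numpy as np


def downsample(lum, factor):
    """Step 1: block-average a 2-D luminance array by an integer factor.

    Block averaging is an assumption; the abstract does not name a method.
    """
    h = (lum.shape[0] // factor) * factor
    w = (lum.shape[1] // factor) * factor
    blocks = lum[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def phase_only_map(lum, eps=1e-12):
    """Steps 2-5: DFT, unit-amplitude normalization, inverse DFT, squaring."""
    spectrum = np.fft.fft2(lum)                   # step 2
    unit = spectrum / (np.abs(spectrum) + eps)    # step 3: keep phase only
    recon = np.fft.ifft2(unit)                    # step 4
    # For real input the reconstruction is real up to rounding error.
    return np.real(recon) ** 2                    # step 5


def phase_salience(lum, factors=(1, 2, 4), eps=1e-12):
    """Step 6: combine the maps from every downsampling factor.

    Assumed combination rule: upsample each map back to the input size by
    pixel repetition, scale it to a peak of about 1, and average.
    """
    h, w = lum.shape
    total = np.zeros((h, w))
    for f in factors:
        m = phase_only_map(downsample(lum, f), eps)
        up = np.repeat(np.repeat(m, f, axis=0), f, axis=1)
        full = np.zeros((h, w))                   # zero-pad any cropped edge
        full[:up.shape[0], :up.shape[1]] = up
        total += full / (full.max() + eps)
    return total / len(factors)
```

A call such as `phase_salience(luminance)` on a 2-D float array returns a map of the same shape. For an image containing a single bright pixel, the amplitude spectrum is already flat, so the phase-only reconstruction returns that pixel unchanged and the map peaks at its location.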