Abstract
Humans are very good at recognizing objects from just their 2D outlines. Previous work has modeled discrimination as correlation, using linear systems identification methods to identify internal shape templates (Kurki et al., 2014, JOV; Wilder et al., 2015, Perception). However, Wilder et al. also noted evidence for nonlinearities in human shape discrimination that are not accounted for by the linear correlation model. One function of these nonlinearities may be to extract high-frequency shape information despite pose uncertainty. To test this hypothesis, we reconsider the experiment conducted by Wilder et al., in which human observers discriminated between shapes corrupted with additive Gaussian coordinate noise. A linear model that assumes no internal pose uncertainty can be estimated from these data and accounts for 56% of the variance of human responses. If the linear model is forced to account for a large degree of internal pose uncertainty (up to 40° of in-plane rotation), the explained variance drops to 36% and is only marginally better than chance. However, a deep neural network (DNN) model trained on the human responses fully recovers this lost variance. By analyzing the gradient of the DNN output with respect to the input, we show that the DNN model achieves this by undoing the random internal pose variations to yield a roughly pose-invariant shape representation. Most importantly, these gradients also show a sensitivity to higher shape frequencies that is not revealed by linear systems identification methods. In summary, a DNN model reveals nonlinearities in human shape discrimination, and these nonlinearities allow higher shape frequencies to be used for shape discrimination despite substantial amounts of internal pose uncertainty.
Meeting abstract presented at VSS 2018
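
To make the gradient analysis concrete, the sketch below illustrates one way to take the gradient of a DNN observer's output with respect to an input contour and project it onto shape frequencies. This is a minimal illustration, not the authors' implementation: the network architecture, the 100-point contour parameterization, and all names (e.g. ShapeObserver) are assumptions chosen for the example.

```python
# Minimal sketch: gradient of a DNN observer's decision variable with
# respect to the input contour, projected onto shape frequencies.
# The untrained toy network below stands in for a DNN fit to human responses.
import torch
import torch.nn as nn

N_POINTS = 100  # assumed number of contour sample points

class ShapeObserver(nn.Module):
    """Toy stand-in for a DNN trained to predict human shape-discrimination responses."""
    def __init__(self, n_points=N_POINTS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_points, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),  # logit of the "target shape" response
        )

    def forward(self, xy):
        # xy: (batch, n_points, 2) boundary coordinates
        return self.net(xy.flatten(start_dim=1)).squeeze(-1)

model = ShapeObserver()
model.eval()

# One noisy stimulus contour (here simply a unit circle plus additive
# Gaussian coordinate noise, mimicking the stimulus construction).
theta = torch.linspace(0, 2 * torch.pi, N_POINTS)
contour = torch.stack([torch.cos(theta), torch.sin(theta)], dim=-1)
stimulus = (contour + 0.05 * torch.randn_like(contour)).unsqueeze(0)
stimulus.requires_grad_(True)

# Gradient of the decision variable with respect to the input: the DNN
# analogue of a linear classification template, evaluated locally
# around this particular stimulus.
logit = model(stimulus)
grad, = torch.autograd.grad(logit.sum(), stimulus)

# Radial component of the gradient, Fourier-transformed around the
# contour to ask which shape frequencies drive the decision.
radial_grad = (grad[0] * contour).sum(dim=-1)         # (n_points,)
freq_sensitivity = torch.fft.rfft(radial_grad).abs()  # amplitude per shape frequency
print(freq_sensitivity[:10])
```

In the study itself, such gradients would be computed for the trained observer model and compared across stimuli differing in pose; here the code only demonstrates the mechanics of the gradient-based sensitivity measure.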