Jayanth Koushik, Austin Marcus, Aarti Singh, Michael Tarr; Real-time Optimization for Visual Feature Identification. Journal of Vision 2018;18(10):413. doi: https://doi.org/10.1167/18.10.413.
Vision experiments typically use a small number of stimuli binned into discrete conditions. While effective for testing a small number of hypotheses, restricting the space of experimental stimuli is not optimal for studying visual representation. In particular, the high dimensionality of the feature spaces underlying visual categories implies that it is critical to explore larger stimulus spaces. For example, while some ventral-cortical regions in the human brain are understood to be selective for particular image classes, the underlying visual properties that drive such selectivity are not well understood. When the goal is to explicate which properties matter, choosing from a small stimulus set is likely to lead to poor feature identification; yet high-dimensional feature spaces are challenging to search thoroughly. To more efficiently identify the critical features driving brain responses, we adopted an active approach (Leeds et al. 2014; Leeds and Tarr 2016) using a closed-loop system consisting of: 1) collection of EEG signals for a given visual stimulus (an image drawn from a space of distorted face images or, in a separate experiment, from the space of all grayscale images at a given resolution); and 2) a Bayesian optimization algorithm (Mockus 1975) that selects the next image to display, chosen because it is expected to produce a larger response, in a given EEG time window (e.g., a 50 ms window starting at 170 ms post stimulus presentation), than the previous image. This loop was repeated until we reached a maximum response value (or the system iterated through a fixed number of steps). Validating our method, for the space of distorted face images, results demonstrated replicable convergence toward less-distorted images, consistent with the facial selectivity of the N170 ERP signal. For our larger space of unconstrained grayscale images, results revealed characteristics of the mid-level image features critical for visual processing.
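The closed-loop procedure described above can be sketched as a generic Bayesian optimization loop. The sketch below is illustrative only: the EEG response is replaced by a toy one-dimensional objective, and the Gaussian-process surrogate, RBF kernel length scale, candidate grid, and expected-improvement acquisition are our assumptions, not details reported in the abstract.

```python
import math
import numpy as np

def response(x):
    # Stand-in for the measured EEG response (e.g., amplitude in a
    # post-stimulus window); here a smooth bump peaking at x = 0.7.
    return np.exp(-((x - 0.7) ** 2) / 0.02)

def rbf_kernel(a, b, length=0.2):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-5):
    # Gaussian-process posterior mean and std at candidate points.
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf_kernel(x_obs, x_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance is 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI acquisition: expected gain over the best response seen so far.
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def run_loop(n_iters=15, seed=0):
    # Closed loop: observe a response, refit the surrogate, and display
    # the candidate stimulus with the highest expected improvement.
    rng = np.random.default_rng(seed)
    x_cand = np.linspace(0.0, 1.0, 201)        # discretized stimulus space
    x_obs = rng.uniform(0.0, 1.0, 3)           # a few initial random stimuli
    y_obs = response(x_obs)
    for _ in range(n_iters):
        mu, sigma = gp_posterior(x_obs, y_obs, x_cand)
        ei = expected_improvement(mu, sigma, y_obs.max())
        x_next = x_cand[np.argmax(ei)]         # next stimulus to present
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, response(x_next))
    i = int(np.argmax(y_obs))
    return x_obs[i], y_obs[i]
```

In this toy setting, `run_loop()` converges near the response peak at 0.7 within a handful of iterations; in the actual experiment the candidate set would be images and the objective a windowed EEG measure.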
Meeting abstract presented at VSS 2018