Volume 19, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2019
Extracting image statistics by human and machine observers
Author Affiliations & Notes
  • Chien-Chung Chen
    Department of Psychology, National Taiwan University
    Neurobiology and Cognitive Science Center, National Taiwan University
  • Hsiao Yuan Lin
    Department of Psychology, National Taiwan University
  • Charlie Chubb
    Department of Cognitive Sciences, University of California, Irvine
Journal of Vision September 2019, Vol.19, 14c. doi:https://doi.org/10.1167/19.10.14c
Abstract

Machine learning has found many applications in research on higher visual functions. However, few studies have applied machine learning algorithms to understand early visual functions. We applied a deep convolutional neural network (dCNN) to analyze human responses to basic image statistics. The stimuli were band-pass filtered random-dot textures whose pixel luminance distribution was modulated away from uniform by a linear combination of Legendre polynomials of orders i and j, where i = 2 or 3 and j ranged from 1 to 8, excluding i. For each polynomial pair there were 30 modulation depths from 0 (uniform distribution) to 1 (some luminance level had zero probability). The Gaussian spatial frequency bands had peaks ranging from 2 to 32 cyc/deg and a half-octave space constant. Each of six observers classified 7,500–22,500 textures by contrast, skewness, glossiness, naturalness, or aesthetic preference. The psychophysical results served as ground truth for a VGG16 dCNN pretrained on ImageNet. The decisive layer was identified by removing the convolutional layers one by one and finding the point at which the validation accuracy of the network, with a retrained output layer, dropped below 80%. The decisive layer for contrast discrimination contained filters whose profiles consisted of repeated geometric patterns, suggesting a general texture-processing mechanism. The decisive layer was the same for glossiness, naturalness, and aesthetic preference, and it comprised filters whose profiles looked like parts of objects. The spatial frequency tuning function, assessed by the validation accuracy with one spatial frequency band left out of the training set, was low-pass for all properties except contrast, which showed an inverted-W shape with peaks at 4 and 16 cyc/deg. Our results suggest possible properties of the visual mechanisms used to sense texture qualities, and our analysis also shows that a dCNN can be a useful tool for early vision research.
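As a rough illustration of the stimulus construction described above, the sketch below generates one band-pass filtered random-dot texture whose pixel luminance histogram is perturbed by a pair of Legendre polynomials. This is a minimal sketch under stated assumptions, not the authors' stimulus code: the image size, field of view, the equal split of the modulation depth between the two polynomial orders, and the ordering of the histogram-modulation and filtering steps are all illustrative choices.

```python
# Minimal sketch (not the authors' code): a random-dot texture whose pixel
# luminance density on [-1, 1] is uniform plus a Legendre-polynomial
# perturbation, then band-pass filtered with a Gaussian band in log frequency.
import numpy as np
from numpy.polynomial import legendre

def modulated_texture(order_i, order_j, depth, size=256, peak_cpd=4.0,
                      bandwidth_oct=0.5, deg_per_image=8.0, seed=None):
    rng = np.random.default_rng(seed)

    # Target luminance density: 1/2 * (1 + c_i*P_i(x) + c_j*P_j(x)).
    # Splitting the modulation depth equally between the two orders is an assumption.
    coefs = np.zeros(max(order_i, order_j) + 1)
    coefs[order_i] = depth / 2.0
    coefs[order_j] = depth / 2.0

    x = np.linspace(-1.0, 1.0, 1024)
    pdf = 0.5 * (1.0 + legendre.legval(x, coefs))
    pdf = np.clip(pdf, 0.0, None)      # at full depth some level reaches zero probability
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]

    # Inverse-transform sampling draws one luminance per pixel (the random dots).
    u = rng.random((size, size))
    dots = np.interp(u, cdf, x)

    # Gaussian spatial-frequency band: peak in cyc/deg, half-octave space constant.
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size), indexing="ij")
    f_cpd = np.hypot(fx, fy) * size / deg_per_image
    with np.errstate(divide="ignore"):
        log_dist = np.where(f_cpd > 0, np.log2(f_cpd / peak_cpd), -np.inf)
    band = np.exp(-0.5 * (log_dist / bandwidth_oct) ** 2)
    band[0, 0] = 0.0                    # remove the DC term

    return np.real(np.fft.ifft2(np.fft.fft2(dots) * band))

# Example: a texture modulated by orders 3 and 2 at half modulation depth.
tex = modulated_texture(order_i=3, order_j=2, depth=0.5)
```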
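The layer-ablation analysis can be sketched in a similar spirit. The code below is a hypothetical reconstruction, not the authors' pipeline: it truncates an ImageNet-pretrained VGG16 at successively earlier convolutional layers, trains only a fresh output layer on the human classification labels, and reports where validation accuracy falls below the 80% criterion. The dataset objects (train_ds, val_ds), the pooling head, the class count, and the training schedule are assumptions.

```python
# Sketch (assumed, not the authors' pipeline) of the decisive-layer search:
# freeze VGG16, cut it at a given convolutional layer, retrain only the output
# layer on the psychophysical labels, and measure validation accuracy.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def truncated_accuracy(cut_layer_name, train_ds, val_ds, n_classes=2, epochs=5):
    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False                         # only the new output layer is trained

    trunk = models.Model(base.input, base.get_layer(cut_layer_name).output)
    model = models.Sequential([
        trunk,
        layers.GlobalAveragePooling2D(),
        layers.Dense(n_classes, activation="softmax"),   # retrained output layer
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_ds, validation_data=val_ds, epochs=epochs, verbose=0)
    return model.evaluate(val_ds, verbose=0)[1]

# Remove convolutional layers from the top down; the decisive layer is the
# point at which validation accuracy first drops below the 80% criterion.
conv_names = [l.name for l in VGG16(weights=None, include_top=False).layers
              if isinstance(l, layers.Conv2D)]
# for name in reversed(conv_names):               # train_ds / val_ds are placeholders
#     if truncated_accuracy(name, train_ds, val_ds) < 0.80:
#         print("accuracy drops below criterion when truncated at", name)
#         break
```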

Acknowledgement: MOST (Taiwan) 105-2420-H-002-006-MY3