August 2023
Volume 23, Issue 9
Open Access
Vision Sciences Society Annual Meeting Abstract | August 2023
Visual Analogy Between Object Parts
Author Affiliations
  • Hongjing Lu
    University of California, Los Angeles
  • Shuhao Fu
    University of California, Los Angeles
Journal of Vision August 2023, Vol.23, 5163. doi:

      Hongjing Lu, Shuhao Fu; Visual Analogy Between Object Parts. Journal of Vision 2023;23(9):5163.

      © ARVO (1962-2015); The Authors (2016-present)

When asked, “If a tree had a knee, where would it be?” preschoolers can point to a sensible location. This example of visual analogical reasoning illustrates the human ability to find and exploit resemblances based on relations among entities, rather than solely on the entities themselves. But how do perception and reasoning systems work together to accomplish visual analogy from pixel-level inputs? To address this question, we developed a spatial mapping task to measure the consistency of human judgments in finding analogous parts between two images. In Experiment 1 we used synthetic images of vehicles (cars, buses, motorcycles, bikes) generated from 3D object models. Participants were shown one image with two markers (each indicating one part of an object) and asked to place markers on corresponding locations in a different image (size of 400 by 300 pixels). Humans showed consistent judgments in placing the markers, with small spatial variability (~15 pixels). Marker variability was lower when the two images were from the same rather than different object types, and when the images showed objects from similar rather than different viewpoints. In Experiment 2 we used the same task with realistic images of vehicles and obtained similar results. We developed a computational model for visual analogy, which first decomposes an object into parts using a deep learning model for semantic part segmentation, and then builds a structural representation using both visual features of parts and spatial relations between parts. These structural representations of objects are encoded as attributed graphs, and mapping is performed using a probabilistic graph matching algorithm. The model achieves close-to-human performance on the mapping task and predicts the influence of object type and viewpoints on mapping variability. These results support the essential role of structural representations of objects derived from raw images in performing downstream reasoning tasks.
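The mapping stage described above can be illustrated with a minimal sketch. It assumes that part segmentation has already produced one feature vector and one centroid per part, and it stands in a simple Sinkhorn-normalised soft assignment for the probabilistic graph matching algorithm, which the abstract does not specify. The function names (`sinkhorn`, `match_parts`) and the parameters `beta` and `w_rel` are hypothetical and chosen for illustration only.

```python
import numpy as np


def sinkhorn(scores, n_iters=50):
    """Alternately normalise rows and columns of exp(scores).

    For a square matrix this approaches a doubly stochastic matrix,
    which is read here as a soft part-to-part correspondence.
    """
    p = np.exp(scores - scores.max()) + 1e-12  # small floor avoids zero rows
    for _ in range(n_iters):
        p /= p.sum(axis=1, keepdims=True)
        p /= p.sum(axis=0, keepdims=True)
    return p


def match_parts(feat_a, pos_a, feat_b, pos_b, n_rounds=10, beta=5.0, w_rel=1.0):
    """Soft-match the parts of object A to the parts of object B.

    feat_*: (n, d) arrays of part features (node attributes, assumed given).
    pos_*:  (n, 2) arrays of part centroids; pairwise offsets between
            centroids serve as edge attributes (an illustrative choice).
    Returns an (n_a, n_b) matrix of soft correspondences.
    """
    # Node similarity: negative squared distance between part features.
    node = -((feat_a[:, None, :] - feat_b[None, :, :]) ** 2).sum(-1)

    # Edge attributes: offset from each part to every other part.
    rel_a = pos_a[:, None, :] - pos_a[None, :, :]
    rel_b = pos_b[:, None, :] - pos_b[None, :, :]

    # compat[i, j, k, l]: how well edge (i, k) in A agrees with edge (j, l) in B.
    diff = rel_a[:, None, :, None, :] - rel_b[None, :, None, :, :]
    compat = -(diff ** 2).sum(-1)

    # Start from node similarity alone, then let matched neighbours
    # reinforce matches whose spatial relations agree.
    p = sinkhorn(beta * node)
    for _ in range(n_rounds):
        support = np.einsum("ijkl,kl->ij", compat, p)
        p = sinkhorn(beta * (node + w_rel * support))
    return p
```

Given a marker placed on part `i` of the first image, the row `p[i]` can be read as a distribution over candidate parts in the second image. The spread of that distribution is one way such a model could express mapping variability, although the abstract does not say how the authors derive it.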

