September 2024
Volume 24, Issue 10
Open Access
Vision Sciences Society Annual Meeting Abstract  |   September 2024
The impact of semantic descriptions on learning object-to-object relationships in a scene
Author Affiliations & Notes
  • Victoria Nicholls
    Goethe University Frankfurt
  • Kim Kira Philippi
    Goethe University Frankfurt
  • Mohamed Abdelrahman
    Goethe University Frankfurt
  • Jo Ackerman
    Goethe University Frankfurt
  • Franka Schulze
    Goethe University Frankfurt
  • Melissa Le-Hoa Vo
    Goethe University Frankfurt
  • Footnotes
    Acknowledgements  This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; Arena)
Journal of Vision September 2024, Vol.24, 1254. doi:https://doi.org/10.1167/jov.24.10.1254

      Victoria Nicholls, Kim Kira Philippi, Mohamed Abdelrahman, Jo Ackerman, Franka Schulze, Melissa Le-Hoa Vo; The impact of semantic descriptions on learning object-to-object relationships in a scene. Journal of Vision 2024;24(10):1254. https://doi.org/10.1167/jov.24.10.1254.


      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Our knowledge of scenes is thought to have a hierarchical structure. At the lowest level are local objects, often smaller objects such as soap. These are followed by anchor objects, often larger objects such as sinks. A local and an anchor object together, e.g., soap on a sink, form a phrase. Phrases can contain multiple local objects (co-locals), and multiple phrases combined form a scene. What remains unclear is how we learn this structure: can it be learned through visual associations alone, or is semantic object information required? To examine this, we conducted two experiments. In the learning phase of the first experiment, participants were presented with objects in isolation, accompanied either by audio descriptions of the objects' functions or by non-descriptive audio. This was followed by two recall phases. In the first, participants were presented with pairs of objects and rated on a scale from 1 to 9 how likely the objects were to be grouped together, based on the descriptions they had received in the learning phase. In the second recall phase, participants were shown a scene image containing all the objects and grouped the objects into phrases based on the object descriptions they had received. In the learning phase of the second experiment, participants viewed videos of phrases in scenes, in which each object was highlighted, again accompanied by descriptive or non-descriptive audio. This was followed by the same rating recall phase as in the first experiment. In the video conditions, we found that participants learned the anchor-local relationships even with non-descriptive audio, whereas descriptive audio boosted learning of the local-to-local relationships. This suggests that hierarchical scene knowledge can be learned through visual associations alone, but that the detail of this knowledge can be improved by including semantic information, such as descriptions of the functions objects perform together.
