James Ryland, Alice O'Toole, Richard Golden; Orientation, Rotary Motion, and Congruency Effects: Models of Visual Object Identification. Journal of Vision 2015;15(12):1092. doi: 10.1167/15.12.1092.
© ARVO (1962-2015); The Authors (2016-present)
We developed a computational model of object identification (HAT-F) that is consistent with the Transformational Framework of Recognition (TFR) (Graf, 2006) and tested it as a model of human object recognition. The HAT-F model combines convolution, coordinate adjustment, and multiple-view templates. The TFR proposes that a hybrid of coordinate adjustment and multiple views could account for human recognition phenomena that static retinotopic theories fail to predict. To test this claim, we examined the agreement between human and HAT-F object recognition performance: a) across novel orientations in the image plane, b) while objects were rotating, and c) when objects were preceded by unrelated primes in either congruent or incongruent orientations. In addition, we compared HAT-F against models consistent with invariant-representation and multiple-view theories. We examined HAT-F's accuracy and reaction times when classifying learned objects (N = 150) from the Revised Snodgrass set (Rossion, 2004) in unlearned image orientations. Consistent with human behavior (e.g., Lawson, 2003), HAT-F's accuracy followed a W-shaped curve over orientation (R² = 0.9491), and its reaction time followed an inverse W-shaped curve (R² = 0.9215). The multiple-view and invariant-representation models displayed accuracy curves uncharacteristic of human object recognition behavior. The qualitative difference between the accuracy curves for the three approaches was statistically reliable, F(22, 3278) = 61.85, MSe = 0.054, p < 1 × 10⁻¹⁵. Additionally, HAT-F was more accurate at recognizing objects preceded by primes in congruent orientations, F(1, 99) = 49.86, MSe = 0.163, p < 2.31 × 10⁻¹⁰, and at recognizing objects rotating towards an upright orientation, F(1, 149) = 22.39, MSe = 0.0788, p < 5.13 × 10⁻⁶. Both effects match human behavior (e.g., Graf, 2005; Jolicoeur, 1992).
These results indicate that a combination of convolution, transformation, and multiple-view models can account for planar orientation effects, rotary motion effects, and orientation congruency effects, partially validating the TFR’s claim.
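To make the idea of pairing coordinate adjustment with multiple-view templates concrete, the following is a minimal sketch under stated assumptions, not the HAT-F implementation, whose details the abstract does not give. The function names `normalized_correlation` and `identify`, the restriction to 90-degree rotation steps, the correlation-based matching score, and the toy 5×5 images are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_correlation(a, b):
    """Cosine similarity between two mean-centred images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def identify(image, templates):
    """Return (label, quarter_turns, score) for the best match.

    templates maps a label to a list of stored views (2-D arrays).
    Assumption for illustration: the input is rotated in 90-degree
    steps (a crude coordinate adjustment) and every rotated copy is
    compared against every stored view (multiple-view templates).
    """
    best = (None, 0, -np.inf)
    for k in range(4):
        adjusted = np.rot90(image, k)  # k counter-clockwise quarter turns
        for label, views in templates.items():
            for view in views:
                score = normalized_correlation(adjusted, view)
                if score > best[2]:
                    best = (label, k, score)
    return best


# Toy stand-ins for learned objects (hypothetical, not the Snodgrass set).
bar = np.zeros((5, 5))
bar[0, :] = 1.0                 # horizontal bar along the top edge
corner = np.zeros((5, 5))
corner[0, :3] = 1.0             # short horizontal arm
corner[:3, 0] = 1.0             # short vertical arm
templates = {"bar": [bar], "corner": [corner]}

# Present the bar in an unlearned orientation (one quarter turn).
probe = np.rot90(bar, 1)
label, turns, score = identify(probe, templates)
print(label, turns, round(score, 3))
```

In this toy setting, the number of quarter turns needed to reach the best match could serve as a rough stand-in for adjustment cost, but treating it as a reaction-time measure is an assumption of the sketch rather than something the abstract states.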
Meeting abstract presented at VSS 2015