Abstract
Object encoding has been the focus of computational modeling and neuro-physiological studies for the last 30 years. Neuro-physiological studies have revealed much about how object recognition proceeds from V1 to V2, V4, and the IT area, but computational modeling has not incorporated this information. We propose a scheme of object representation that incorporates a number of facts about the neural codes of objects. In this scheme, an object is encoded by a hierarchical model that includes three probability distributions (PDs): the PD of object parts and geometry, the PD of natural object structures (NOS), and the PD of spatial arrangements of NOS. Here, NOS are topology-conserving, multi-view, multi-scale, spatial concatenations of 2D and 3D object features (depth, surface orientation, and surface curvature). We took five steps to compile NOS: 1) sample a large number of circular patches of natural objects using a hexagonal configuration; 2) perform independent component analysis on the circular patches to obtain independent components (ICs); 3) fit Gabor functions to the ICs and classify the ICs into 16 orientations and a set of clusters for each orientation; 4) project the circular patches onto the clusters of ICs and compute the features of the circular patches; and 5) partition the space of feature vectors into a set of NOS. Using this procedure, we obtained a large number of NOS from a dataset of 200 small natural objects acquired by a laser range scanner and a digital camera. These NOS provide a classification of natural object patches and include all concatenations of 2D and 3D object features. Since NOS include both 2D and 3D features, this model unifies structural, 3D-based, and appearance-based approaches to object recognition. We tested this model on 3D object recognition and found that its performance was comparable to or better than that of current models of 3D object recognition.
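The five-step procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation. It covers only the geometric and bookkeeping parts of steps 1, 3, 4, and 5 and omits step 2 (the independent component analysis). Several details are assumptions that the abstract does not state: the hexagonal configuration is read as one center patch with six touching neighbors, orientations are binned evenly over [0, pi), and the feature space is partitioned by nearest prototype. All function names are hypothetical.

```python
import math

N_ORIENTATIONS = 16  # from the abstract: ICs are classified into 16 orientations


def hexagonal_centers(cx, cy, radius):
    """Step 1 (assumed layout): a center patch plus six neighbors at 60-degree steps."""
    spacing = 2 * radius  # assumption: neighboring circular patches touch
    centers = [(cx, cy)]
    for k in range(6):
        angle = k * math.pi / 3
        centers.append((cx + spacing * math.cos(angle),
                        cy + spacing * math.sin(angle)))
    return centers


def circular_patch(image, cx, cy, radius):
    """Step 1: collect the pixel values inside a circle, in a fixed scan order."""
    values = []
    r = int(math.ceil(radius))
    x0, y0 = int(round(cx)), int(round(cy))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx * dx + dy * dy <= radius * radius:
                values.append(image[y0 + dy][x0 + dx])
    return values


def orientation_bin(theta):
    """Step 3 (assumed binning): map a fitted Gabor orientation to one of 16 bins."""
    theta = theta % math.pi          # orientation is periodic with period pi
    width = math.pi / N_ORIENTATIONS
    return int((theta + width / 2) // width) % N_ORIENTATIONS


def project(patch, filters):
    """Step 4: feature vector as the dot product of a patch with each filter."""
    return [sum(p * f for p, f in zip(patch, flt)) for flt in filters]


def nearest_prototype(feature, prototypes):
    """Step 5 (assumed rule): assign a feature vector to the closest prototype."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(prototypes)),
               key=lambda i: dist2(feature, prototypes[i]))
```

In a fuller pipeline, the `filters` passed to `project` would be the clustered ICs obtained in steps 2 and 3, and the `prototypes` would come from whichever partitioning method is chosen for step 5. The abstract specifies neither, so both are left as inputs here.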
Meeting abstract presented at VSS 2014