Abstract
Given sufficient spatial context, humans easily identify an object in an image and determine the depth relationships between the surfaces that compose it. Here we investigated the relationship between identification and relative depth estimation in natural images of single human body parts as a function of spatial extent. In the first experiment, observers viewed an image of an elbow or knee through a square aperture ranging from 20 to 120 pixels (0.09 to 0.65 degrees) and were asked to identify the joint. A second experiment asked observers to judge the depth order of two points placed across a figure-ground boundary, across a self-occluding boundary, or on the same surface. The same images were used for recognition and depth judgments. The images were taken from the Leeds Sports Dataset, and human depth performance was evaluated against ground truth defined by the Unite the People 3D mesh model. About 50 observers were recruited online for each experiment. We also investigated the role of local intensity and contrast differences, and of the spatial spectral slope of the color channels, in depth-order estimation. We found that the correlation between depth-ordering accuracy and recognition accuracy decreased with increasing aperture size. Some apertured images yielded above-chance depth-ordering accuracy even though recognition accuracy was below chance, indicating that part recognition is not a necessary condition for accurate depth judgments. Depth judgments were most accurate when the points crossed figure-ground boundaries and least accurate when they crossed self-occluding boundaries. Contrary to our expectation, local intensity and contrast differences correlated negatively with depth-ordering accuracy.
The spatial spectral slopes of the luminance and blue-yellow channels correlated positively with depth-ordering accuracy on the same surface at the largest aperture size, suggesting a possible contribution of skin color to depth estimation of the body surface.