For the baseline task, observers viewed all 300 windowed material images, presented foveally, and performed the classification task described in the Procedure section. A baseline measure of performance is important for several reasons. To examine texture as a cue for material category, we need to know how well observers can identify the category when all cues are present (i.e., in the original material images). To our knowledge, no previous study has examined untimed, grayscale material recognition with the MIT-Flickr database using the subset of images we chose here. Importantly, it is not obvious that observers will be perfect at this task: even within a category, the images span a wide range of three-dimensional shapes, object identities, surface reflectances, physical scales, and illuminations. Our later experiments compare performance in this baseline condition with performance under degraded viewing conditions. If observers categorize the textures less accurately than the baseline images, this would imply that texture is not a sufficient cue for category. In other words, it would imply that the information lost by converting a baseline material image to a P-S texture (e.g., shape or large-scale layout information) is necessary for robust material classification. If, on the other hand, texture classification performance is indistinguishable from baseline performance, we cannot draw definitive conclusions about the necessity of texture for material classification.