The current results indicate that the limitation imposed by crowding can be specific to object category. In the literature, there is consensus that crowding reflects a limitation on object recognition (Levi,
2008; Pelli & Tillman,
2008; Whitney & Levi,
2011). Previous studies have investigated the effect of crowding when the target and flankers were in the same or different object categories (Huckauf, Heller, & Nazir,
1999; Reuther & Chakravarthi,
2014; Yeh, He, & Cavanagh,
2012) or when the target and flankers had the same or different shapes (Kooi, Toet, Tripathy, & Levi,
1994). However, little is known regarding how crowding is modulated by object category (when the target and flankers are in the same object category) because this has rarely been the focus of previous studies (but see Grainger, Tydgat, & Isselé,
2010). Recently, it has been proposed that crowding is independent of object categories (Pelli & Tillman,
2008). In contrast to this proposition, the present study demonstrates that lab-induced training can reduce crowding specifically for musical stimuli and not for nonmusical Landolt Cs. This is consistent with the previous observation of reduced crowding specifically for musical notes in real-world music-reading experts compared with novices (Y. K. Wong & Gauthier,
2012). It is difficult to explain the category specificity of crowding with purely bottom-up accounts of crowding (e.g., general changes in receptive field size or long-range horizontal connections in the visual periphery; Levi & Waugh,
1994) or featural integration (Pelli & Tillman,
2008). Instead, the category specificity of crowding suggests that there is a top-down component of crowding, possibly contributed by higher level object representations, because perceptual expertise similar to that created in the present training typically leads to changes in the representation of the trained objects in the higher visual regions (Gauthier, Skudlarski, Gore, & Anderson,
2010; A. C.-N. Wong, Palmeri, Rogers, Gore, & Gauthier,
2009; A. C.-N. Wong et al.,
2012; see also Grainger et al.,
2010) and sometimes in both early and higher visual areas (Y. K. Wong, Folstein, & Gauthier,
2012; Y. K. Wong & Gauthier,
2010a). It is possible that, through interaction with higher level representations of musical notation, crowding is alleviated by reducing inappropriate featural integration specifically for musical notation (Pelli & Tillman,
2008) or by enhancing the spatial resolution of receptive fields (S. He et al.,
1996; Tripathy & Cavanagh,
2002). Future work should clarify the mechanisms underlying the category-specific reduction in crowding, which would provide important constraints on theories of crowding in general.