Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood.