This study investigated whether individual differences in cognitive functions, attentional abilities in particular, are associated with individual differences in the quality of phonological representations, resulting in variability in speech perception and production. To do so, we took advantage of a tone merging phenomenon in Cantonese and identified three groups of typically developed speakers who could differentiate the two rising tones (high rising and low rising) in both perception and production [+Per+Pro], in perception only [+Per-Pro], or in neither modality [-Per-Pro]. Perception was indexed by discrimination sensitivity (d′), and production by acoustic measures of pitch offset and rise time differences. Event-related potential (ERP) components, namely the mismatch negativity (MMN) and the ERPs to amplitude rise time, were taken to reflect the representations of the acoustic cues of tones. Components of attention and working memory in the auditory and visual modalities were assessed with published test batteries. The results show that individual differences in both perception and production are linked to how listeners encode and represent the acoustic cues (pitch contour and rise time), as reflected by ERPs. The present study extends previous work by integrating measures of perception, production, attention, and quality of representation to offer a comprehensive account of the cognitive factors underlying individual differences in speech processing. In particular, it is proposed that domain-general attentional switching affects the quality of perceptual representations of the acoustic cues, giving rise to individual differences in perception and production.