A key neural challenge in speech perception is extracting discrete phonetic categories from continuous, multidimensional signals despite varying task demands and surface-acoustic variability. Although neural representations of speech categories have previously been identified in frontal and posterior temporal-parietal regions, the task dependency and dimensional specificity of these representations remain unclear. Here, we asked native Mandarin-speaking participants to listen to speech syllables carrying 4 distinct lexical tone categories during passive listening, repetition, and categorization tasks while they underwent functional magnetic resonance imaging (fMRI). We used searchlight classification and representational similarity analysis (RSA) to identify the dimensional structure underlying neural representations across tasks and surface-acoustic properties. Searchlight classification analyses revealed significant "cross-task" lexical tone decoding within the bilateral superior temporal gyrus (STG) and left inferior parietal lobule (LIPL). RSA revealed that representations in the LIPL and left STG (LSTG), in contrast to the right STG (RSTG), relate to 2 critical dimensions (pitch height and pitch direction) underlying tone perception. Outside this core representational network, we found greater activation in inferior frontal and parietal regions during tone categorization for stimuli that are more perceptually similar. Our findings reveal the specific characteristics of fronto-temporo-parietal regions that support speech representation and categorization processing.
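
To make the RSA logic concrete, the following is a minimal sketch, not the analysis pipeline used in the study. It assumes that each tone category can be placed on a pitch-height axis and a pitch-direction axis, builds one model dissimilarity matrix per dimension, and compares each with a neural dissimilarity matrix computed from response patterns. The tone coordinates, the function names (`model_rdm`, `neural_rdm`, `rdm_similarity`), the use of Pearson correlation, and the simulated voxel patterns are all illustrative assumptions.

```python
import numpy as np

# The 4 Mandarin lexical tone categories.
TONES = ["T1", "T2", "T3", "T4"]

# Hypothetical coordinates chosen only for illustration; they are NOT
# values taken from the study.
PITCH_HEIGHT = np.array([1.0, 0.5, 0.0, 0.6])
PITCH_DIRECTION = np.array([0.0, 1.0, 0.3, -1.0])


def model_rdm(feature):
    """Model RDM: absolute pairwise difference along one dimension."""
    return np.abs(feature[:, None] - feature[None, :])


def neural_rdm(patterns):
    """Neural RDM: 1 - Pearson correlation between category patterns.

    `patterns` has shape (n_categories, n_voxels).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(rdm):
    """Return the off-diagonal upper-triangle entries of a square RDM."""
    rows, cols = np.triu_indices(rdm.shape[0], k=1)
    return rdm[rows, cols]


def rdm_similarity(rdm_a, rdm_b):
    """Pearson correlation between the upper triangles of two RDMs."""
    a, b = upper_triangle(rdm_a), upper_triangle(rdm_b)
    return float(np.corrcoef(a, b)[0, 1])


def demo(seed=0, n_voxels=100):
    """Simulate patterns in one region and relate them to both models."""
    rng = np.random.default_rng(seed)
    # Simulated region whose voxels mix both dimensions plus noise.
    w_height = rng.normal(size=n_voxels)
    w_direction = rng.normal(size=n_voxels)
    patterns = (
        np.outer(PITCH_HEIGHT, w_height)
        + np.outer(PITCH_DIRECTION, w_direction)
        + 0.1 * rng.normal(size=(len(TONES), n_voxels))
    )
    region = neural_rdm(patterns)
    return {
        "pitch_height": rdm_similarity(region, model_rdm(PITCH_HEIGHT)),
        "pitch_direction": rdm_similarity(region, model_rdm(PITCH_DIRECTION)),
    }


if __name__ == "__main__":
    for dimension, r in demo().items():
        print(f"{dimension}: r = {r:.3f}")
```

In a searchlight setting, the same comparison would be repeated for the patterns within each local sphere of voxels. Rank-based correlation and permutation testing are common choices for the comparison and its significance, but they are omitted here for brevity.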