Summary

Percepts and words can be decoded from largely distributed neural activity measures. The existence of widespread representations might, however, conflict with the fundamental notions of hierarchical processing and efficient coding. Using fMRI and MEG during syllable identification, we first show that sensory and decisional activity co-localize to a restricted part of the posterior superior temporal cortex. Next, using intracortical recordings we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine-decoding was possible from neuronal activity sampled across widespread regions, despite weak or absent sensory or decision-related responses. These findings show that a complex behavior like speech sound categorization relies on an efficient readout of focal neural activity, while distributed activity, although decodable by machine-learning, reflects collateral processes of sensory perception and decision.