Using a cross-modal picture-word interference (PWI) task, we examined phonological representations and phonological encoding in Mandarin-speaking children and adults. Pictures of monosyllabic words were presented visually, with auditory primes presented before, concurrently with, or after the picture’s onset (stimulus onset asynchronies, SOAs, of -200, -100, 0, and +150 ms). Primes were related to the targets in Onset, Rhyme, Tone, Onset-Tone, or Rhyme-Tone, or were unrelated. The rhymes of the target words were counterbalanced between simple and complex structures to examine effects of rhyme complexity. Twenty Mandarin-speaking adults (aged 20;3 to 23;10), 20 school-age children (aged 9;1 to 10;11), and 20 preschoolers (aged 5;0 to 5;11) were asked to name the pictures as quickly as possible while ignoring the primes played over a headset. Adults exhibited consistent Onset and Onset-Tone priming effects across the later SOAs, whereas the older children (9- to 10-year-olds) exhibited Onset, Rhyme, Onset-Tone, and Rhyme-Tone priming effects across the later SOAs. The younger children (5-year-olds), in contrast, exhibited Rhyme and Rhyme-Tone priming effects only at the earliest SOA. For both groups of children, the Rhyme and Rhyme-Tone priming effects depended on rhyme complexity. Our findings suggest that the phonological representations of Mandarin speakers develop from holistic units into units with an onset-based structure. Moreover, an incremental processing pattern at the sub-syllabic level develops gradually and emerges around the age of 9 or 10, although some susceptibility to holistic phonological similarity is retained.