As one of the major Chinese dialects, Cantonese has a tone system consisting
of nine lexical tones and three additional changed tones, which is considerably
more complex than that of Mandarin. The most important acoustic feature characterizing
these tones is the contour of the voice fundamental frequency (the F₀ contour).
In this article we present an approach to modeling F₀ contours of Cantonese
utterances, based on an extension of the command-response model. Analysis-by-synthesis
of F₀ contours of utterances with a fixed carrier frame, in which a target
syllable with each tone type is embedded, shows that each tone type can be
represented by a specific pattern (polarity, timing, and amplitude) of tone commands.
These patterns are found to be largely preserved in F₀ contours of
utterances with unconstrained text. With the definition of these tone command
patterns, the command-response model not only provides a novel phonological
description of tones, but also yields highly accurate approximations of F₀ contours
of Cantonese utterances and allows one to analyze various tonal phenomena
in quantitative terms. Quantitative distinctions between various tones are then
revealed by statistical analysis of the timing and amplitude of tone commands.
In particular, a systematic timing alignment is found between the onsets/offsets of
tone commands and the rhyme of a syllable. This allows a set of timing constraints to be
introduced which, together with constraints on tone command amplitudes and phrase
command parameters, is then applied to generate F₀ contours of Cantonese
utterances. The validity of the approach is verified by perceptual evaluation of
synthetic speech stimuli with model-generated F₀ contours, in terms of both the
intelligibility of tones and the naturalness of prosody.
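
For readers unfamiliar with the command-response model, the following is a minimal sketch of its commonly cited basic form, in which ln F₀ is the sum of a baseline, the responses to impulse-like phrase commands, and the responses to step-like tone commands of either polarity. It is not the extended model developed in this article. The function names, the default constants (alpha, beta, gamma), and the command values in the usage lines are illustrative assumptions, not values taken from the analysis.

```python
import math


def phrase_response(t, alpha=3.0):
    """Response to a unit phrase command (impulse); alpha is an assumed default."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def tone_response(t, beta=20.0, gamma=0.9):
    """Response to a unit tone command onset (step), capped at the ceiling gamma.

    beta and gamma are assumed defaults for illustration.
    """
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, tone_cmds,
               alpha=3.0, beta=20.0, gamma=0.9):
    """Generate F0 values (Hz) at the given times (s).

    fb          : baseline frequency in Hz
    phrase_cmds : list of (onset_time, magnitude)
    tone_cmds   : list of (onset_time, offset_time, amplitude);
                  amplitude may be positive or negative (command polarity)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        for t1, t2, at in tone_cmds:
            log_f0 += at * (tone_response(t - t1, beta, gamma)
                            - tone_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical usage: one phrase command and two tone commands of opposite polarity.
times = [i * 0.01 for i in range(100)]
contour = f0_contour(
    times,
    fb=100.0,
    phrase_cmds=[(0.0, 0.4)],
    tone_cmds=[(0.20, 0.40, 0.3), (0.50, 0.70, -0.3)],
)
```

In this formulation a tone type corresponds to a pattern of polarity, timing, and amplitude in the `tone_cmds` list, which is the kind of pattern the analysis above assigns to each Cantonese tone.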