As a feature unique to spoken language, speech prosody plays an important role in human communication. Although the acoustic features of speech are most often analyzed in a frame-by-frame manner, this is not always appropriate for prosodic features, since they are tightly related to higher-level linguistic information, such as syntactic and discourse structures, and extend over wide time spans, such as syllables, words, and phrases. To handle features of this kind, models of prosody have been developed. Among them, the generation process model of fundamental frequency contours is attractive, since it relates well to the linguistic information of utterances. The model has been successfully applied to hidden Markov model (HMM) based speech synthesis and to a listening test for determining the (perceptual) categorical boundaries of Japanese accent types.