Numerous studies have indicated that prosodic phrase boundaries may be marked by a variety of acoustic phenomena, including segmental lengthening. It has not been established, however, whether this lengthening is restricted to the immediate vicinity of the boundary or whether it extends over some larger region. In this study, segmental lengthening in the vicinity of prosodic boundaries is examined and found to be restricted to the rhyme of the syllable preceding the boundary. By using a normalized measure of segmental lengthening and compensating for differences in speaking rate, it is also shown that at least four distinct types of boundaries can be distinguished on the basis of this lengthening.
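The abstract does not say how the normalized measure is computed. As a minimal sketch, assuming a per-phone z-score and a single multiplicative speaking-rate factor (both are illustrative assumptions, as are the names `phone_stats`, `normalized_durations`, and `rate_factor`), a rate-compensated duration could be expressed as follows:

```python
from collections import defaultdict
from statistics import mean, pstdev


def phone_stats(segments):
    """Return {phone: (mean, std)} of durations from (phone, seconds) pairs."""
    by_phone = defaultdict(list)
    for phone, duration in segments:
        by_phone[phone].append(duration)
    return {p: (mean(d), pstdev(d)) for p, d in by_phone.items()}


def normalized_durations(segments, stats, rate_factor=1.0):
    """Z-score each duration against its phone's mean and std.

    rate_factor is a hypothetical speaking-rate scale: values above 1.0
    model slower speech by stretching the expected mean and std.
    """
    scores = []
    for phone, duration in segments:
        mu, sigma = stats[phone]
        if sigma == 0:
            # Assumption: a phone with no observed spread gets a neutral score.
            scores.append(0.0)
        else:
            scores.append((duration - rate_factor * mu) / (rate_factor * sigma))
    return scores


# Toy data, invented for illustration only.
segments = [("a", 0.10), ("a", 0.20), ("t", 0.05), ("t", 0.05)]
stats = phone_stats(segments)
scores = normalized_durations(segments, stats)
```

Under these assumptions, a segment whose score stays well above zero after rate compensation is longer than expected for its phone, which is the kind of evidence the abstract associates with a preceding prosodic boundary.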
Prosodic structure and syntactic structure are not identical; neither are they unrelated. Knowing when and how the two correspond could yield better-quality speech synthesis, could aid in the disambiguation of competing syntactic hypotheses in speech understanding, and could lead to a more comprehensive view of human speech processing. In a set of experiments involving 35 pairs of phonetically similar sentences representing seven types of structural contrasts, the perceptual evidence shows that some, but not all, of the pairs can be disambiguated on the basis of prosodic differences. The phonological evidence relates the disambiguation primarily to boundary phenomena, although prominences sometimes play a role. Finally, phonetic analyses describing the attributes of these phonological markers indicate the importance of both absolute and relative measures.
SRI International is currently involved in the development of a new generation of software systems for automatic scoring of pronunciation as part of the Voice Interactive Language Training System (VILTS) project. This paper describes the goals of the VILTS system, the speech corpus, and the algorithm development. The automatic grading system uses SRI's Decipher™ continuous speech recognition system [1] to generate phonetic segmentations that are used to produce pronunciation scores at the end of each lesson. The scores produced by the system are similar to those of expert human listeners. Unlike previous approaches in which models were built for specific sentences or phrases, we present a new family of algorithms designed to perform well even when knowledge of the exact text to be used is not available.