Consonant durations as single sounds and in double and triple clusters, in both intervocalic and boundary positions, were measured on spectrograms of 100 nonsense words uttered five times in sentence frames by a professional speaker. Inherent durations, obtained in the single unstressed condition, appear consistently related to manner of articulation. Consonants in double clusters undergo general lengthening in intervocalic position and general shortening in initial position, whereas smaller effects were observed in both cases when the clusters were widened. Finally, the variations were expressed as modification coefficients of the inherent durations, yielding rules that approximate the measured values within a range smaller than their standard deviation.
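The idea of expressing durational variation as modification coefficients applied to inherent durations can be sketched as follows. This is only an illustration under stated assumptions: the table names, the function names, and every numeric value are hypothetical placeholders, not figures reported in the study.

```python
# Hypothetical inherent durations in milliseconds, keyed by manner of
# articulation. Placeholder values, NOT measurements from the study.
INHERENT_MS = {
    "stop": 80.0,
    "fricative": 100.0,
    "nasal": 70.0,
    "liquid": 60.0,
}

# Hypothetical modification coefficients keyed by (cluster size, position).
# The directions follow the abstract (lengthening in intervocalic double
# clusters, shortening in initial ones, smaller effects in triple clusters);
# the magnitudes are invented for illustration.
COEFFICIENT = {
    (1, "intervocalic"): 1.0,
    (2, "intervocalic"): 1.2,
    (2, "initial"): 0.8,
    (3, "intervocalic"): 1.1,
    (3, "initial"): 0.9,
}


def predicted_duration(manner: str, cluster_size: int, position: str) -> float:
    """Scale the inherent duration by the context-specific coefficient."""
    return INHERENT_MS[manner] * COEFFICIENT[(cluster_size, position)]


def within_sd(predicted: float, measured_mean: float, measured_sd: float) -> bool:
    """Check whether a prediction deviates from the mean by less than one SD."""
    return abs(predicted - measured_mean) < measured_sd
```

With these placeholder tables, a stop in an intervocalic double cluster is predicted to be longer than its inherent duration, and one in an initial double cluster shorter. `within_sd` mirrors the acceptance criterion stated in the abstract.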
This paper illustrates the phonetic transcription rules implemented in a text-to-speech synthesis system for Italian. The development system employed for rule writing is presented first. The system makes it possible to process linguistic information at various levels, using a conventional context-dependent format. The rule set, made up of graphemic, phonetic and word boundary rules, is then described. The transcription rules furnish a reasonable solution to the [s, z], [ts, dz], [‘e, ‘ε], [‘o, ‘ɔ], [i, j, i] and [u, w, i] phone distributions, as used in Northern Italian pronunciation. Statistical evaluation of the transcriber's performance was carried out on a significant corpus of words, with encouraging results: 93.1% correct transcriptions.
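A context-dependent rule of the general form "A becomes B between a left and a right context" might be sketched as below. This is a minimal illustration and not the system's actual rule format or rule set: the `Rule` class, the `apply_rules` function, and the single example rule (voicing of a single s between vowels, a pattern commonly described for Northern Italian pronunciation) are all assumptions made for the example. The output mixes graphemes and phones purely for demonstration.

```python
from dataclasses import dataclass

VOWELS = frozenset("aeiou")
BOUNDARY = "#"  # hypothetical word-boundary symbol


@dataclass(frozen=True)
class Rule:
    """Rewrite `target` as `output` when flanked by the given contexts."""
    target: str
    output: str
    left: frozenset   # symbols allowed immediately to the left
    right: frozenset  # symbols allowed immediately to the right


def apply_rules(word: str, rules: list) -> str:
    """Apply the first matching rule to each symbol, scanning left to right."""
    padded = BOUNDARY + word + BOUNDARY
    out = []
    for i in range(1, len(padded) - 1):
        symbol = padded[i]
        for rule in rules:
            if (symbol == rule.target
                    and padded[i - 1] in rule.left
                    and padded[i + 1] in rule.right):
                out.append(rule.output)
                break
        else:
            # No rule matched: copy the symbol unchanged.
            out.append(symbol)
    return "".join(out)


# Illustrative rule: s -> z between vowels (assumed, not taken from the paper).
RULES = [Rule("s", "z", VOWELS, VOWELS)]
```

Under this sketch, `apply_rules("rosa", RULES)` rewrites the intervocalic s, while the word-initial s and the geminate in `"sasso"` are left unchanged because their contexts do not match.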
Traditional grammars consider the presence of one segment marked [+high] and the presence of stress (lexical accent) on the contiguous [-high] vowel as the main factors responsible for diphthong realization in Italian; yet they report that diphthongs also occur in unstressed contiguous vowels. These statements are considered here as the hypotheses to be investigated. Assuming that the gliding process of diphthongization involves shortening, durations of Italian vowels and vowel-like sounds in candidate diphthongs and nondiphthongs (i.e., vowel clusters) occurring in both stressed and unstressed sequences within word boundaries were measured in two different speech materials: (1) nonsense words embedded in sentence frames; (2) meaningful words in sentences. In both contexts, the test words contained pairs of contiguous vowels to be pronounced as a diphthong or as a vowel cluster, according to recognized Italian phonotactic possibilities. Results show the following: (1) In both corpora, segmental duration differences do not effectively characterize the realization of diphthongs and vowel clusters in stressed sequences, while in unstressed sequences there is a strong duration difference between a diphthong and the corresponding vowel cluster, but only in the onglide ([+high/-high]) case. (2) Vowel clusters appear to be articulated differently in meaningful sentences than in nonsense utterances, even when the speaker manages to maintain the same nominal speaking rate. These results will be applied in the text-to-speech synthesis of Italian that is now being developed.