This paper presents the results of investigations into the role of the syllable as the main unit for automatic speech recognition and speech synthesis for Polish. The research reveals that the acoustic properties of a given syllable make it easily detectable in the speech signal. Artificial neural networks were used for the classification tasks. An auditory test also showed that the use of syllables is a good solution for speech synthesis. The article presents parts of the author's doctoral thesis, 'Acoustic-phonetic analysis of syllable in Polish for use in speech technology'.
This paper presents a module forming part of a syllabification system for Polish that is currently under development. The main purpose of this module is to map the orthographic form of consonant clusters located within words to their phonological transcription. The mapping is very precise: each letter in the orthographic form is assigned to a particular phoneme or group of phonemes. The main goal is to allow phonological principles (the sonority principle and the principle of maximal onset) to be applied directly to words written in orthographic form. These principles are based on a phonological sonority scale that assigns abstract numbers to speech sounds. These numbers
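As a rough illustration of how the sonority principle and the principle of maximal onset can interact, the following minimal sketch splits a word-internal consonant cluster between two vowels. The sonority values in `SONORITY`, the helper names `split_cluster` and `syllabify`, and the use of single letters as stand-ins for phonemes are all assumptions made for this example; they are not the scale or the mapping module described in the abstract.

```python
# Hypothetical sonority scale: illustrative values only, not the paper's scale.
SONORITY = {
    "p": 1, "t": 1, "k": 1, "b": 1, "d": 1, "g": 1,   # stops
    "f": 2, "s": 2, "x": 2, "v": 2, "z": 2,            # fricatives
    "m": 3, "n": 3,                                     # nasals
    "l": 4, "r": 4,                                     # liquids
    "j": 5, "w": 5,                                     # glides
    "a": 6, "e": 6, "i": 6, "o": 6, "u": 6, "y": 6,    # vowels
}
VOWEL = 6


def split_cluster(cluster):
    """Split a consonant cluster into (coda, onset).

    Maximal onset: the onset is the longest suffix of the cluster whose
    sonority rises strictly towards the following vowel.
    """
    start = len(cluster)
    while start > 0 and (
        start == len(cluster)
        or SONORITY[cluster[start - 1]] < SONORITY[cluster[start]]
    ):
        start -= 1
    return cluster[:start], cluster[start:]


def syllabify(phonemes):
    """Return a list of syllables, each a list of phoneme symbols."""
    nuclei = [i for i, p in enumerate(phonemes) if SONORITY[p] == VOWEL]
    if not nuclei:
        return [list(phonemes)]
    syllables, start = [], 0
    for cur, nxt in zip(nuclei, nuclei[1:]):
        coda, _onset = split_cluster(phonemes[cur + 1:nxt])
        end = cur + 1 + len(coda)          # syllable ends after its coda
        syllables.append(list(phonemes[start:end]))
        start = end
    syllables.append(list(phonemes[start:]))  # last syllable takes the rest
    return syllables


print(syllabify(list("karta")))   # [['k', 'a', 'r'], ['t', 'a']]
print(syllabify(list("patrol")))  # [['p', 'a'], ['t', 'r', 'o', 'l']]
```

In "karta" the cluster "rt" falls in sonority, so only "t" can open the second syllable; in "patrol" the cluster "tr" rises in sonority, so the whole cluster is assigned to the onset.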
This paper discusses the assumptions of a Multi-Layer Transcription Model (hereinafter: MLTM). The solution presented is an advanced grapheme-to-phoneme (G2P) conversion method that can be implemented in technical applications, such as automatic speech recognition and synthesis systems. The features of MLTM also facilitate the application of text-to-transcription conversion in linguistic research. The model presented here is the basis for multi-step processing of the orthographic representation of words, which are transcribed gradually. The consecutive stages of the procedure include, among other things, identification of multi-character phonemes, voicing status change, and consonant cluster simplification. The multi-layer model described in this paper makes it possible to assign individual phonetic processes (for example, assimilation), as well as other types of transformation, to particular layers. As a result, the set of rules becomes more transparent. Moreover, the rules related to any process can be modified independently of the rules connected with other forms of transformation, provided that the latter have been assigned to a different layer. These properties give solutions based on the multi-layer transcription model crucial advantages, such as flexibility and transparency. The model makes no assumptions about the number of layers, their functions, or the number of rules defined in each layer. A special mechanism used for the implementation of the MLTM concept enables projection of individual characters onto either a phonemic or a phonetic transcript (obtained after processing in the final layer of the MLTM-based system has been completed). The solution presented in this text has been implemented for the Polish language; however, the same model could also be used for other languages.
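The layered idea can be sketched as a pipeline of independent functions, each rewriting a list of segments that remember which orthographic characters they came from. This is only a minimal sketch under stated assumptions: the `Segment` class, the two layers, the tiny rule tables, and the SAMPA-like symbols are hypothetical choices for illustration and are not the rule set or the projection mechanism of the MLTM implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    letters: str  # orthographic characters this segment projects from
    symbol: str   # current transcription symbol


# Hypothetical, deliberately tiny rule tables (SAMPA-like symbols).
MULTI = {"sz": "S", "cz": "tS", "ch": "x", "rz": "Z"}
SINGLE = {"w": "v", "ł": "w", "ó": "u"}
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S"}


def layer_graphemes(segments):
    """Layer 1: merge multi-character graphemes, map single letters."""
    out, i = [], 0
    while i < len(segments):
        pair = "".join(s.letters for s in segments[i:i + 2])
        if pair in MULTI:
            out.append(Segment(pair, MULTI[pair]))
            i += 2
        else:
            s = segments[i]
            out.append(Segment(s.letters, SINGLE.get(s.letters, s.symbol)))
            i += 1
    return out


def layer_final_devoicing(segments):
    """Layer 2: voicing status change, here only word-final devoicing."""
    if not segments:
        return segments
    last = segments[-1]
    devoiced = Segment(last.letters, DEVOICE.get(last.symbol, last.symbol))
    return segments[:-1] + [devoiced]


# Layers are plain functions, so they can be added, removed, or edited
# independently of one another.
LAYERS = [layer_graphemes, layer_final_devoicing]


def transcribe(word, layers=LAYERS):
    """Run every layer in order over one-letter starting segments."""
    segments = [Segment(ch, ch) for ch in word.lower()]
    for layer in layers:
        segments = layer(segments)
    return segments


def transcript(segments):
    return "".join(s.symbol for s in segments)


result = transcribe("chleb")
print(transcript(result))                        # xlep
print([(s.letters, s.symbol) for s in result])   # [('ch', 'x'), ('l', 'l'), ('e', 'e'), ('b', 'p')]
```

Because each segment keeps its source letters, the final list doubles as a character-to-symbol projection, and a further layer (for example, consonant cluster simplification) could be appended to `LAYERS` without touching the existing ones.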