2018
DOI: 10.1121/1.5064463

Intelligibility assessment of cleft lip and palate speech using Gaussian posteriograms based on joint spectro-temporal features

Abstract: Intelligibility is considered as one of the primary measures for speech rehabilitation of individuals with a cleft lip and palate (CLP). Currently, speech processing and machine-learning-based objective methods are gaining more research interest as a way to quantify speech intelligibility. In this work, joint spectro-temporal features computed from a time–frequency representation of speech are explored to derive speech representations based on Gaussian posteriograms. A comparative framework using dynamic time …
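For readers unfamiliar with the term used in the abstract, a Gaussian posteriogram is commonly understood as the per-frame vector of posterior probabilities over the components of a Gaussian mixture model. The sketch below is illustrative only and is not taken from the paper: the function names, the diagonal-covariance assumption, and the toy parameters are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriogram(frames, weights, means, variances):
    """Return one posterior vector per frame (assumed GMM parameters given).

    frames:    list of feature vectors, one per time frame
    weights:   mixture weights, one per component
    means:     component mean vectors
    variances: component diagonal variances
    """
    result = []
    for x in frames:
        # Weighted log-likelihood of the frame under each component.
        logs = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Subtract the maximum before exponentiating for numerical stability.
        mx = max(logs)
        exps = [math.exp(value - mx) for value in logs]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

# Toy usage: two 1-D components centred at 0 and 4 (hypothetical values).
toy = posteriogram([[2.0], [0.0]], [0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]])
```

Each row of the result sums to one, so a sequence of rows can be compared frame by frame against a reference sequence by whatever alignment method a given study uses.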

Cited by 6 publications (3 citation statements)
References 35 publications
“…support vector machine (SVM), Gaussian mixture model (GMM)) that map the features to clinical articulation scores. Most of these ML models [18], [19] are trained on a set of acoustic features extracted from a particular disorder. However, as we see in Fig.…”
Section: A Related Work (mentioning)
confidence: 99%
“…As an alternative to traditional features based on signal processing, joint spectro-temporal features have been proposed to model CV transition regions in machine learning models. The two-dimensional discrete cosine transform (2D-DCT) is one of the most commonly used approaches to modeling the spectrotemporal dynamics of CV transition regions [19], [22]-[24]. Supervised learning methods using 2D-DCT features as input have been used for classification of place of articulation in stop consonants [23], to evaluate the goodness of /t/ and /k/ productions in children with speech sound disorders [25], to detect stop consonant production errors [24], and to model perceptual intelligibility ratings in children with CP [19].…”
Section: A Related Work (mentioning)
confidence: 99%
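The 2D-DCT mentioned in the statement above can be sketched as a separable transform applied to a small time-frequency patch, keeping only the low-order coefficients as features. This is a minimal illustration under stated assumptions, not the method of any cited paper: the patch layout (rows as frequency bins, columns as time frames), the orthonormal DCT-II scaling, and the names `dct_1d`, `dct_2d`, and `low_order_features` are all hypothetical choices.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(patch):
    """Separable 2-D DCT: transform each row, then each column."""
    rows = [dct_1d(list(r)) for r in patch]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result has the same shape as the input patch.
    return [list(r) for r in zip(*cols)]

def low_order_features(patch, n_freq=3, n_time=3):
    """Keep the top-left n_freq x n_time block of coefficients as a flat list.

    Assumes patch[f][t] indexes frequency bin f and time frame t.
    """
    coeffs = dct_2d(patch)
    return [coeffs[i][j] for i in range(n_freq) for j in range(n_time)]

# Toy usage on a constant 4x4 patch (hypothetical data).
features = low_order_features([[1.0] * 4 for _ in range(4)])
```

Truncating to the low-order block is a common way to summarize the slow spectral and temporal variation in a patch with a short vector; the block size of 3 by 3 here is arbitrary.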
“…The obvious shortcomings of the method of assessing the quality of speech restoration, based on GOST R 58040-95 [4], led to the need to develop algorithms for automatic evaluation and their implementation in the framework of automated systems for assessing speech intelligibility. At the moment, there are a number of works on automation of speech intelligibility assessment [5][6][7], but all of them appeared later than the beginning of development and the appearance of the main algorithms of the considered software complex. As part of the research on speech restoration using technical methods, such algorithms were developed and implemented.…”
Section: Introduction (mentioning)
confidence: 99%