2014 XIX Symposium on Image, Signal Processing and Artificial Vision
DOI: 10.1109/stsiva.2014.7010187
Pattern recognition of hypernasality in voice of patients with Cleft and Lip Palate

Abstract: Cleft Lip and Palate (CLP) is a malformation with a high incidence in Colombia that affects the phonation system, hindering the patient's ability to communicate effectively. This research seeks to find patterns that make it possible to detect hypernasality without resorting to invasive diagnostic methods. We analyzed a wide range of acoustic features to identify those capable of discriminating hypernasality. The analyzed features include: Teager energy operator (TEO), linear predictive cod…
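The discrete Teager energy operator named in the abstract has a simple closed form, Ψ[x](n) = x[n]² − x[n−1]·x[n+1]. The sketch below is an illustrative implementation of that standard definition, not the authors' code; for a pure sinusoid A·sin(ωn) the operator is exactly constant at A²·sin²(ω), which is why it tracks amplitude and frequency jointly.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator:
    Psi[x](n) = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For x[n] = A*sin(w*n), the identity sin(a-b)*sin(a+b) = sin^2(a) - sin^2(b)
# makes Psi exactly constant: A^2 * sin(w)^2.
n = np.arange(1000)
tone = 2.0 * np.sin(0.1 * n)
psi = teager_energy(tone)
```

Hypernasality work typically compares low-pass and band-pass TEO profiles of such energy tracks rather than the raw values.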

Cited by 10 publications (4 citation statements); references 6 publications.
“…Mel-frequency cepstral coefficients (MFCCs) and other spectral transformations [37], [38], [36], [39], [40], [41], [42], [43], [44], glottal source related features (jitter and shimmer) [45], [46], difference between the low-pass and bandpass profile of the Teager Energy Operator (TEO) [47], [48], and non-linear features [49], [50] have all been proposed as model input features. Gaussian mixture models (GMM), support vector machines, and deep neural networks have been used in conjunction with these features for hypernasality evaluation from word and sentence level data [51], [52], [53], [54]. Recently, end-to-end neural networks taking MFCC frames as input and producing hypernasality assessments as output have also been proposed [55].…”
Section: A. Related Work (mentioning)
confidence: 99%
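The MFCC front end that this citation statement describes can be sketched from its textbook definition: power spectrum of a windowed frame, triangular mel filterbank, log, then a DCT. The code below is a minimal illustration of that pipeline (filter count, FFT size, and the `mfcc_frame` helper are assumptions for the sketch, not the cited papers' exact configuration).

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=26, n_ceps=13):
    """MFCCs for one frame: Hamming window -> power spectrum
    -> triangular mel filterbank -> log -> DCT-II."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # Filter edges equally spaced on the mel scale, 0 Hz to Nyquist.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(spec)))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(fbank @ spec + 1e-10)
    return dct(log_energy, type=2, norm='ortho')[:n_ceps]

sr = 16000
frame = np.sin(2 * np.pi * 440 * np.arange(512) / sr)
coeffs = mfcc_frame(frame, sr)
```

Frame-level vectors like `coeffs` are what the cited GMM/SVM systems aggregate over words or sentences, and what the end-to-end networks consume directly.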
“…Figure 1 shows a general scheme of audio classification with deep learning, where the input to the neural network is a spectral representation of the audio, in this case a spectrogram computed with the short-time Fourier transform (STFT) after the audio has first been multiplied by a window (windowing). In the context of this project, the goal is for the feature-extraction part of the network to pass on to the later layers information about certain acoustic measures known to be related to voice quality, such as shimmer, jitter, and the harmonics-to-noise ratio (HNR) [5][6][7].…”
Section: Materials and Methods (unclassified)
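The windowed-STFT spectrogram that this excerpt describes as the network input can be sketched directly: slide a window over the signal, multiply each frame by it, and take the FFT magnitude of each frame. This is an illustrative sketch (window type, FFT size, and hop length are assumptions, not the project's settings).

```python
import numpy as np

def stft_spectrogram(x, n_fft=512, hop=128):
    """Magnitude spectrogram: Hann-windowed frames -> FFT per
    frame -> (freq_bins, time_frames) array."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# A 1 kHz tone at 16 kHz lands exactly on bin 1000/(16000/512) = 32.
x = np.sin(2 * np.pi * 1000 * np.arange(16000) / 16000.0)
S = stft_spectrogram(x)
```

An array like `S` (often log-compressed) is what a convolutional feature extractor would consume in the scheme of Figure 1.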
“…The shimmer value is associated with the voice quality [2][3][4][5][6][7], state of mind [8][9][10][11][12][13], age [14], and gender [15] of a person. Many research works use shimmer (among other measures) for goals ranging from pathology detection [6,16,17] to improving human-machine interfaces through estimation of the intentionality of a spoken phrase [19].…”
Section: Introduction (mentioning)
confidence: 99%
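The jitter and shimmer measures recurring in these citation statements have common "local" definitions: the mean absolute difference between consecutive pitch periods (jitter) or cycle peak amplitudes (shimmer), normalized by the mean. The sketch below implements those standard definitions; the input period/amplitude sequences would come from a pitch tracker, which is outside this sketch.

```python
import numpy as np

def jitter_local(periods):
    """Local jitter (%): mean |difference| between consecutive
    pitch periods, relative to the mean period."""
    p = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_local(amplitudes):
    """Local shimmer (%): the same ratio computed on the peak
    amplitudes of consecutive glottal cycles."""
    a = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(a))) / np.mean(a)

# Perfectly periodic voicing gives zero jitter/shimmer;
# cycle-to-cycle perturbation raises both.
jit = jitter_local([0.0100, 0.0102, 0.0099, 0.0101])  # seconds
shim = shimmer_local([1.00, 0.96, 1.03, 0.98])
```

Higher values of either measure indicate greater cycle-to-cycle instability, which is why they appear as inputs for both pathology detection and voice-quality modeling.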