Artificial Neural Networks in the Disabled Speech Analysis (2009)
DOI: 10.1007/978-3-540-93905-4_41

Cited by 14 publications (12 citation statements: 0 supporting, 12 mentioning, 0 contrasting)
References: 16 publications
“…Early studies in disfluency detection that appeared to report positive results [15,16] lacked complete statistical findings; therefore, their significance cannot be determined. Other studies presented stuttering events in isolated speech segments to an artificial neural network (ANN), such that the ANN was actually performing a classification task, rather than recognition in continuous speech [17,18]. Further studies used a hybrid approach to detect stuttering in children's reading tasks: Heeman et al. [19,20] merged ASR outputs with the clinician's own manual annotations to produce corrected transcripts of the stuttering speech; this approach could not be described as fully automatic.…”
Section: Literature Review
confidence: 99%
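To make the contrast concrete, the sketch below (illustrative only, not taken from refs [17,18]) shows the segment-level setup those studies used: each pre-cut speech segment is reduced to a fixed-length feature vector and an ANN assigns it a single label, with no decoding over continuous speech. The 13-dimensional features, dummy labels, and network size are all placeholder assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))    # placeholder: one 13-dim feature vector per isolated segment
y = rng.integers(0, 2, size=200)  # placeholder labels: 0 = fluent, 1 = stuttering event

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
ann.fit(X_tr, y_tr)  # classifies isolated segments...
print("segment-level accuracy:", ann.score(X_te, y_te))  # ...rather than recognizing continuous speech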
“…The number of PLP and LPC coefficients depends on the order of the LPC. A different number of MFCCs (13, 15, 17, and 24) is used to characterize the disfluent speech. In this work, a 10-fold cross-validation scheme is used to prove the reliability of the classification results.…”
Section: Results
confidence: 99%
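A sketch of that evaluation protocol follows, with synthetic clips standing in for the disfluent-speech corpus; librosa, scikit-learn, and the SVM classifier are assumptions of this sketch, not details taken from the cited work.

import numpy as np
import librosa
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
sr = 16000
clips = [rng.normal(size=sr) for _ in range(60)]  # dummy 1-second clips
labels = rng.integers(0, 2, size=60)              # dummy fluent/disfluent labels

for n_mfcc in (13, 15, 17, 24):                   # the MFCC counts compared above
    # Mean MFCC vector per clip gives a simple fixed-length representation.
    X = np.array([librosa.feature.mfcc(y=c, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
                  for c in clips])
    scores = cross_val_score(SVC(), X, labels, cv=10)  # 10-fold cross-validation
    print(f"n_mfcc={n_mfcc}: mean accuracy {scores.mean():.2f}")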
“…The various feature extraction methods that have been explored in stuttering recognition systems are: autocorrelation function and envelope parameters [78]; duration, energy peaks, and word-based and part-word-based spectral features [79-81]; age, sex, type of disfluency, frequency of disfluency, duration, physical concomitant, rate of speech, historical, attitudinal and behavioral scores, and family history [38]; duration and frequency of disfluent portions and speaking rate [26]; frequency, 1st to 3rd formant frequencies and their amplitudes [81,82]; spectral measures (512-point fast Fourier transform (FFT)) [83,84]; mel frequency cepstral coefficients (MFCCs) [81,85-87]; linear predictive cepstral coefficients (LPCCs) [81,86]; pitch and shimmer [88]; zero crossing rate (ZCR) [81]; short-time average magnitude and spectral spread [81]; linear predictive coefficients (LPCs) and weighted linear prediction cepstral coefficients (WLPCCs) [86]; maximum autocorrelation value (MACV) [81]; linear prediction-Hilbert transform based MFCCs (LH-MFCCs) [89]; noise to harmonic ratio, shimmer, harmonic to noise ratio, harmonicity, amplitude perturbation quotient, formants and their variants (min, max, mean, median, mode, std), and spectrum centroid [88]; Kohonen's self-organizing maps [84]; i-vectors [90]; perceptual linear prediction (PLP) [87]; respiratory biosignals [39]; and the sample entropy feature [91]. With the recent developments in convolutional neural networks, the feature representation of stuttered speech is moving from conventional MFCCs towards spectrogram representations.…”
Section: Statistical Approaches
confidence: 99%
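To ground a few of the surveyed features, the snippet below computes MFCCs, zero crossing rate, spectral centroid, and the log-mel spectrogram that CNN-based systems increasingly consume in place of raw MFCCs. librosa is an assumption of this sketch, not the tooling of the cited systems, and the random signal merely stands in for real speech.

import numpy as np
import librosa

rng = np.random.default_rng(0)
sr = 16000
y = rng.normal(size=2 * sr)  # dummy 2-second clip standing in for real speech

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # cepstral features
zcr = librosa.feature.zero_crossing_rate(y)               # zero crossing rate
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # spectrum centroid
mel = librosa.feature.melspectrogram(y=y, sr=sr)          # spectrogram-style CNN input
log_mel = librosa.power_to_db(mel)                        # log-mel spectrogram

for name, feat in [("MFCC", mfcc), ("ZCR", zcr),
                   ("centroid", centroid), ("log-mel", log_mel)]:
    print(name, feat.shape)  # each feature is a (bands, frames) matrix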