2014
DOI: 10.2478/ecce-2014-0009

Improving Speech Recognition Rate through Analysis Parameters

Abstract: The speech signal is redundant and non-stationary by nature. Because of vocal tract inertia, its variations are not very rapid, and the signal can be considered stationary over short segments. The short-time magnitude spectrum is presumed to contain the most distinctive information in speech, which is the main reason for analyzing the speech signal frame by frame. For this purpose, the analyzed signal is divided into overlapping segments (so-called frames). Segments of 15-25 ms with the overl…
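The frame-by-frame analysis described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 20 ms frame length is chosen from the 15-25 ms range given above, while the 10 ms increment, the Hamming window, the 16 kHz sample rate, and the function names `frame_signal` and `magnitude_spectra` are all assumptions, since the abstract is cut off before the overlap value.

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_ms is the frame width; hop_ms is the frame increment (assumed value).
    Trailing samples that do not fill a whole frame are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame width and increment must be positive")
    if signal.size < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (signal.size - frame_len) // hop_len
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]


def magnitude_spectra(frames):
    """Short-time magnitude spectrum of each frame (Hamming window assumed)."""
    window = np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1))


# Synthetic one-second 440 Hz tone standing in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(signal, sample_rate)
spectra = magnitude_spectra(frames)
print(frames.shape, spectra.shape)
```

Changing `frame_ms` and `hop_ms` is the kind of adjustment to frame width and increment that the citing papers below refer to; a recognizer would typically compute its features from `spectra` rather than use them directly.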

Cited by 14 publications (5 citation statements)
References 33 publications
“…The first step for improving the recognition rate is to adjust the frame width and increment during feature extraction. This improvement has already been shown (Paliwal et al., 2010; Eringis & Tamulevicius, 2014), given that these parameters determine whether the modeling algorithms are getting a sufficient level of information from the speech segment as input.…”
Section: Tuning the Parameters For Feature Extraction
Type: mentioning
confidence: 79%
“…Similar conclusions were deducted (Gulzar et al., 2014; Dave, 2013), but instead of adjusting the frame width, the effect of changing the set of features entirely was studied. In some other work (Eringis and Tamulevicius, 2014) it was shown that by adjusting the frame width and increment for different features, an improvement of 4.15% (from 88.75% to 92.9%) can be achieved.…”
Section: Introduction
Type: mentioning
confidence: 98%
“…The formant extraction method is only reported in study A18. However, formant values may differ depending on the extraction method (e.g., linear predictive coding, Fast Fourier Transform, cepstral analysis), and on extraction/analysis parameters (such as window type, frame size, time step and parameters specific to each method) (Derdemezis et al, 2016;Eringis & Tamulevičius, 2014). Hence, this lack of information does not allow for the replication of the study's methodology, nor for comparative analyses.…”
Section: Consonant Measures
Type: mentioning
confidence: 99%
“…Moreover, the phase information is estimated during the training that leads to better prediction of the clean signal, and also no scaling is needed for the output signal. However, working in the time domain results in a much higher number of network parameters due to the large frame size used, which is proved to be better than smaller frames [12], [26]. This larger number of parameters increases the size of the model, and restricts its applicability in some real time implementations as the model may not fit into the hardware [27].…”
Section: Literature Review
Type: mentioning
confidence: 99%