“…Speech contains information that is usually obtained by processing a speech signal captured by a microphone: sampling, quantization, coding [38], parametrization, preprocessing, segmentation, centring, pre-emphasis, and window weighting [39, 40]. The next step is speech recognition, e.g., a statistical approach to continuous speech recognition [41], different approaches [42] to speech recognition systems [43], or the perceptual linear prediction (PLP) of speech [44], for example,
- audio-to-visual conversion in MPEG-4 [45],
- acoustic modeling and feature extraction [46],
- speech activity detectors [47] or joint training of hybrid neural networks for acoustic modeling in automatic speech recognition [48],
- the RASTA (RelAtive SpecTrAl) method [38], and
- the Mel-frequency cepstral coefficient (MFCC) analysis, for example,
- dimensionality reduction of a pathological voice quality assessment system [49],
- content-based clinical depression detection in adolescents [50],
- speech recognition in an intelligent wheelchair [51],
- speech recognition using features extracted from speech signals of spoken words [52],
- hidden Markov models (HMM) [53], and
- artificial neural networks (ANN) [54], for example,
- a feed-forward neural network (NN) with the back-propagation algorithm and radial basis function neural networks [55],
- an automatic speech recognition (ASR)-based approach for speech therapy of aphasic patients [56],
- fast adaptation of deep neural networks based on discriminant codes for speech recognition [57],
- implementation of DNN-HMM acoustic models for phoneme recognition [58],
- combination of features in a hybrid HMM/MLP and an HMM/GMM speech recognition system [59], and
- hybrid continuous speech recognition systems with HMM, MLP, and SVM [ ...
…”
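As a concrete illustration of the front-end steps named in the passage above (pre-emphasis, segmentation into frames, window weighting, and Mel-frequency cepstral analysis), the following Python sketch computes MFCC features from a raw signal with NumPy and SciPy. It is a minimal sketch only: the function names and parameter choices (25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients) are common illustrative defaults, not values taken from the cited works.

```python
import numpy as np
from scipy.fftpack import dct

def preemphasize(signal, alpha=0.97):
    # Pre-emphasis filter y[n] = x[n] - alpha * x[n-1] boosts high frequencies
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal, sr, frame_ms=25, hop_ms=10):
    # Segment the signal into overlapping frames and apply Hamming weighting
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    return frames * np.hamming(frame_len)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_inv = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = mel_inv(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fbank

def mfcc(signal, sr, n_filters=26, n_coeffs=13, n_fft=512):
    # Pre-emphasis -> framing/windowing -> power spectrum -> mel energies -> DCT
    frames = frame_and_window(preemphasize(signal), sr)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_energies = np.log(energies + 1e-10)
    return dct(log_energies, type=2, axis=1, norm='ortho')[:, :n_coeffs]

# Usage: one second of a synthetic 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440.0 * t)
feats = mfcc(x, sr)
print(feats.shape)  # one 13-coefficient vector per 25 ms frame
```

Each row of `feats` is the feature vector for one frame; in the recognizers surveyed above, such vectors would be the observations fed to an HMM, MLP, or SVM back end.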