Implementation of audio recognition using mel frequency cepstrum coefficient and dynamic time warping in wirama praharsini

Wibawa, I. D. G. Y. A.; Darmawan, I Dewa Made Bayu Atmaja

doi:10.1088/1742-6596/1722/1/012014

Cited by 9 publications

(3 citation statements)

References 2 publications


“…The main extraction algorithms are fast Fourier transform (FFT), Mel filter, logarithmic operation, and discrete cosine transform (DCT). MFCC feature parameters will be used as input to the speech recognition model [24,25]. The speech signal preprocessing is implemented by a first-order FIR high-pass digital filter in the MATLAB system digital filter toolbox.…”

Section: Algorithm Design (mentioning)

confidence: 99%

[Retracted] Innovative Application of Sensor Combined with Speech Recognition Technology in College English Education in the Context of Artificial Intelligence

Guo

2023

Journal of Sensors


English listening is an effective way to improve students’ English expression ability and use oral communication. However, from the current situation of English teaching, the current English teaching methods are too single, and teachers do not focus on oral training in the classroom, resulting in low efficiency of classroom teaching. On the basis of following the principles of wholeness, interaction, balance, and sustainable development of educational ecology, by enhancing the synergy of ecological elements of English speaking classroom, promoting interactive dialogue among ecological subjects, and regulating classroom behaviors, it is conducive to giving full play to the advantageous role of information technology on English speaking teaching reform and promoting its sustainable development. This paper addresses the current situation of English listening teaching, especially the problem of reduced recognition rate of spoken language in noisy environment, and the principle of using dual-sensor speech recognition system proposed. We design the speech recognition method based on recurrent neural network by acquiring the weak vibration pressure speech signal of the jaw skin and the speech signal transmitted through the air during the vocalization process through the sensor. Deep machine learning algorithm is used for speech recognition in English teaching. A reasonable frame sampling frequency is set to obtain the English speech signal, then the feature parameters representing this speech signal are obtained by linear prediction coefficients, and the speech feature vector is generated, followed by the recurrent neural network algorithm to train the speech features. In the related experiments, by comparing with the commonly used speech recognition algorithms, it is proved that the proposed algorithm English teaching speech recognition has higher accuracy and faster convergence.
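The citation statement quoted for this entry names the MFCC steps: a first-order FIR high-pass (pre-emphasis) filter, then FFT, Mel filtering, a logarithm, and a DCT. A minimal Python sketch of that chain for a single frame is given here. It is not the cited paper's MATLAB implementation: the pre-emphasis coefficient 0.97, the Hamming window, the filter and coefficient counts, and all function names are assumptions chosen for illustration, and a naive DFT stands in for an FFT to keep the example dependency-free.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    # First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def power_spectrum(frame):
    # Naive O(N^2) DFT of a real frame; an FFT would normally be used instead
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append((re * re + im * im) / n)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    # Triangular filters with centers equally spaced on the mel scale
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + i * (high - low) / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for i in range(1, num_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def dct2(values, num_coeffs):
    # Unnormalized DCT-II, keeping the first num_coeffs coefficients
    n = len(values)
    return [sum(values[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
            for k in range(num_coeffs)]

def mfcc_frame(frame, sample_rate, num_filters=12, num_coeffs=8):
    n = len(frame)
    # Hamming window (assumed; the statement does not name a window)
    windowed = [frame[i] * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i in range(n)]
    spec = power_spectrum(windowed)
    bank = mel_filterbank(num_filters, n, sample_rate)
    energies = [sum(f * s for f, s in zip(filt, spec)) for filt in bank]
    # Floor the energies so that empty filters do not produce log(0)
    log_energies = [math.log(max(e, 1e-10)) for e in energies]
    return dct2(log_energies, num_coeffs)

# Usage: one 64-sample frame of a 440 Hz tone sampled at 8 kHz
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(64)]
coeffs = mfcc_frame(pre_emphasis(tone), sample_rate=8000)
print(len(coeffs))
```

In a full system each frame of an utterance would be processed this way, and the resulting sequence of coefficient vectors would be passed to the recognizer.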


“…where x(m) denotes the target audio signal, w(n) represents the window function, and sgn is the sign function defined by Equation (11). When x(m) has the same sign as x(m-1), sgn ensures that their difference is zero.…”

Section: Zero Crossing Rate (mentioning)

confidence: 99%

Wavelet-based denoising for wind turbine blades damage detections using audio signals

Chen, Zhang, Zhuang et al.

2023

Second International Conference on Energy, Power, and Electrical Technology (ICEPET 2023)


Currently, the detection of wind turbine blade damage mainly relies on regular plan-based maintenance and manual inspections. In this study, a method for extracting audio features and detecting damage in wind turbine blades with wavelet denoising is proposed. This method first uses wavelet denoising to process the original audio signal; the denoised audio is then split into frames with a Hamming windowing function. After that, multi-scale features are extracted in both time and frequency domains. Principal component analysis is used to reduce the dimensionality of the features, and clustering centers are obtained through K-means clustering analysis. Finally, Gaussian distribution outlier detection is used to detect audio signals from damaged blades. Experimental results using lab-generated audio data show that the proposed method has high accuracy and strong robustness in detecting wind turbine blade damage.
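The statement quoted for this entry describes the zero-crossing rate, and the abstract describes splitting the audio into Hamming-windowed frames. A rough Python sketch of both follows. The paper's Equation (11) is not reproduced on this page, so the sign convention used here (+1 for non-negative samples, -1 otherwise), the rectangular 1/(N-1) weighting, and the function names are all assumptions.

```python
import math

def sgn(x):
    # Assumed convention: +1 for x >= 0, -1 otherwise
    return 1 if x >= 0 else -1

def frame_signal(signal, frame_len, hop):
    # Split the signal into overlapping frames; a trailing partial frame is dropped
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]

def hamming(n):
    # Hamming window coefficients
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def zero_crossing_rate(frame):
    # |sgn(x[m]) - sgn(x[m-1])| is 2 at a sign change and 0 otherwise,
    # so halving the sum counts the crossings
    if len(frame) < 2:
        return 0.0
    crossings = sum(abs(sgn(frame[m]) - sgn(frame[m - 1])) for m in range(1, len(frame))) / 2
    return crossings / (len(frame) - 1)

# Usage: 25 ms frames with a 10 ms hop on a 100 Hz tone sampled at 8 kHz
signal = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
frames = frame_signal(signal, frame_len=200, hop=80)
window = hamming(200)
windowed = [[x * w for x, w in zip(f, window)] for f in frames]
rates = [zero_crossing_rate(f) for f in frames]
print(len(frames), max(rates))
```

Because the Hamming window is strictly positive, multiplying by it does not change the sign of any sample, so the rate is the same whether it is computed before or after windowing.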


“…The feature vectors were extracted from the digital signals of the input speech in the format of MFCCs. (4,5) MFCCs were chosen because they are based on the perceptual characteristics of the human auditory system. (6,7) A block diagram of the MFCC feature extraction process is shown in Fig.…”

Section: Feature Extraction (mentioning)

confidence: 99%

Recognition System for Cantonese Speakers in Different Noisy Environments Based on Estimate–Maximize Algorithm

Yang, Chen, Yang

2022

Sensors and Materials


Highly accurate personal identification systems are required in many different recognition situations. In this study, the Mel-frequency cepstrum coefficient was used to extract the features of speakers. The aim of this study was to identify different speeches in different noisy environments. A maximum likelihood estimation method based on noise probability was proposed to enhance the recognition effects of the Gaussian mixture model of different speeches from mixed noise speech signals. Experimental results indicated that the method had high recognition results under various noise conditions. The recognition results of the proposed method in different noise environments were superior to those of a method using only one type of noise for modeling. Experimental results obtained from some unspecified speakers showed that three different languages (Mandarin, English, and Cantonese) were effectively identified.
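The abstract above pairs MFCC features with Gaussian mixture models and a maximum likelihood decision. The sketch below shows only the generic scoring step: each enrolled speaker has a diagonal-covariance GMM, and the speaker whose model gives the highest total log-likelihood over the feature frames is selected. It does not reproduce the paper's noise-probability estimation or its estimate-maximize training; the model parameters, speaker names, and function names are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    # Log density of a Gaussian with diagonal covariance
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(frames, gmm):
    # gmm is a list of (weight, mean, var) components
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)
        # log-sum-exp over components for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

def identify(frames, models):
    # Maximum likelihood decision over the enrolled models
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))

# Usage with two made-up one-dimensional speaker models
models = {
    "speaker_a": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
    "speaker_b": [(1.0, [5.0], [1.0])],
}
print(identify([[4.8], [5.1], [5.3]], models))
```

In practice the frames would be MFCC vectors and the component parameters would come from training on each speaker's enrollment data.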
