2019
DOI: 10.1109/access.2019.2918147

Fusion Feature Extraction Based on Auditory and Energy for Noise-Robust Speech Recognition

Abstract: Environmental noise can pose a threat to the stable operation of current speech recognition systems. It is therefore essential to develop a front feature set that is able to identify speech under low signal-to-noise ratio. In this paper, a robust fusion feature is proposed that can fully characterize speech information. To obtain the cochlear filter cepstral coefficients (CFCC), a novel feature is first extracted by the power-law nonlinear function, which can simulate the auditory characteristics of the human e…

Citations: cited by 15 publications (8 citation statements)
References: 18 publications
“… $m_{i,j} = \operatorname{Mie}\left(D_{(i,1)}, D_{(i,2)}, \ldots, D_{(i,j)}\right)$ (19), where D is the similarity based on DTW. Calculate the weight (w) for each set of K nearest neighbors using Equation 16. …”
Section: Classification
mentioning
confidence: 99%
“… She et al [19], created a new feature extraction technique using the supplied blended features. The combination uses the cepstral coefficients as a foundation of the cochlear filter to maximize accuracy in noisy surroundings (CFCC). …”
Section: Introduction
mentioning
confidence: 99%
“… Energy-based methods are widely used in speech analysis [23][24]. The characteristic frequency of bearing can also be observed based on energy [25][26][27], such as Teager energy and short-time energy. …”
Section: Introduction
mentioning
confidence: 99%
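The two energy measures named in the statement above can be illustrated with a minimal sketch. It uses the standard textbook definitions (the discrete Teager energy operator and the framewise sum of squared samples); the function names `teager_energy` and `short_time_energy`, and the `frame_len` and `hop` parameters, are assumptions made for illustration and are not taken from the cited works.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the first and last samples
    have no neighbor on one side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def short_time_energy(x, frame_len, hop):
    """Sum of squared samples in each frame of length frame_len.

    Frames start every `hop` samples; a trailing partial frame is dropped.
    (Rectangular window; the framing scheme is an assumption.)
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array(
        [np.sum(x[i * hop : i * hop + frame_len] ** 2) for i in range(n_frames)]
    )
```

For a pure sinusoid A·cos(Ωn + φ), the Teager operator returns the constant A²·sin²(Ω), which is why it is often used to track amplitude and frequency jointly, whereas short-time energy depends only on amplitude within each frame.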
“… In the front-end processing, the Mel frequency cepstral coefficient (MFCC) is widely used to represent the speech signal [1]. Besides, the perceptual linear predictive (PLP) features [2], spectro-temporal features [3], and cochlear filter cepstral coefficients (CFCC) features [4] have also been successfully used for speech recognition. In the backend classification, the statistical acoustic models are commonly used, such as hidden Markov model (HMM) [5], artificial neural network (ANN) [6], and dynamic Bayesian network (DBN) [7]. …”
Section: Introduction
mentioning
confidence: 99%
“… Fig 4. Performance comparison of the proposed algorithm (IGMM20) and original GMM-based feature compensation (GMM400 and GMM20) with different SNRs for the three types of testing noise …”
mentioning
confidence: 99%