Yanzhang Geng scite author profile

At present, pathological voice recognition relies mainly on classifying pathological voices. However, almost all studies use samples of the single vowel /a/, and few address multiple vowels. In addition, current research on multi-vowel recognition mainly targets normal voices, so it is unsuitable for recognizing normal and pathological multi-vowels simultaneously. This paper develops an accurate and robust feature, called enhanced-bark line spectrum pair (E-BLSP), to detect and classify normal and pathological multi-vowels. We explore the impact of the E-BLSP feature on recognition performance and propose an effective method, based on a combination of three features including E-BLSP, for pathological and normal multi-vowels. First, the LSP and difference of adjacent LSP (DAL) features of a vowel are extracted. Then the LSP feature is warped to the Bark domain to obtain the Bark line spectrum pair (BLSP). Next, the E-BLSP feature is calculated by adjusting BLSP using the DAL feature. Finally, the E-BLSP feature and two traditional features, linear prediction cepstrum coefficients (LPCC) and mel-frequency cepstrum coefficients (MFCC), are applied to support vector machine (SVM) and deep neural network (DNN) classifiers to explore the classification performance of single features and feature combinations for the pathological and normal vowels /a/, /i/, and /u/. The results show that the highest accuracies achieved by the DNN and SVM classifiers are 98.6190% and 96.2693%, and the largest areas under the curve (AUC) are 0.9925 and 0.9868, respectively, with the combination of the three features LPCC, MFCC, and E-BLSP.
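The first two feature steps named in the abstract (DAL and Bark warping of LSP) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the LSP values are already available as line spectral frequencies in Hz, it uses the common Zwicker-style Hz-to-Bark approximation (the abstract does not say which Bark formula is used), and the function names and example frequencies are hypothetical. The abstract does not specify how BLSP is adjusted by DAL to obtain E-BLSP, so that step is left out.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Zwicker-style Hz-to-Bark approximation (assumed; the abstract names no formula)."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def dal(lsf_hz):
    """Difference of adjacent LSP values: one value per neighbouring pair."""
    return np.diff(np.asarray(lsf_hz, dtype=float))

def blsp(lsf_hz):
    """Warp LSP frequencies (Hz) onto the Bark scale."""
    return hz_to_bark(lsf_hz)

# Hypothetical line spectral frequencies for one frame, in Hz (illustrative only).
lsf_example = [300.0, 700.0, 1200.0, 1900.0, 2600.0, 3300.0]
dal_example = dal(lsf_example)    # spacing between adjacent LSP frequencies
blsp_example = blsp(lsf_example)  # the same frequencies on the Bark scale
```

Because the Bark mapping is monotonic, the warped values keep the ordering of the original LSP frequencies while compressing the higher ones.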


A speech separation algorithm based on the comb-filter effect

Zhang, Wang, Geng, et al. 2023, Applied Acoustics


A Unified Speech Enhancement System Based on Neural Beamforming With Parabolic Reflector

et al. 2020


This paper presents a unified speech enhancement system that removes both background noise and interfering speech in severe noise environments by jointly using a parabolic reflector model and a neural beamformer. First, the amplification property of the paraboloid is discussed, which significantly improves the signal-to-noise ratio (SNR) of a desired signal. Accordingly, an appropriate paraboloid channel is analyzed and designed using the boundary element method. In addition, a time-frequency masking approach and a mask-based beamforming approach are discussed and incorporated into the enhancement system. Notably, the signals provided by the paraboloid and the beamformer are exactly complementary. Finally, these signals are combined in a learning-based fusion framework to further improve system performance in low-SNR environments. Experiments demonstrate that the system is effective and robust under five noise conditions (speech interfered with by factory, pink, destroyer engine, Volvo, and babble noise) and at different noise levels. Compared with the original noisy speech, the average improvements in objective metrics are about ΔSTOI = 0.28, ΔPESQ = 1.31, and ΔfwSegSNR = 11.9.
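As a rough illustration of what mask-based beamforming usually involves, the sketch below estimates speech and noise spatial covariance matrices from time-frequency masks and derives MVDR weights for one frequency bin. This is a generic textbook formulation, not the system described in the abstract: the abstract does not state which beamformer is used, and the function names, the choice of MVDR, and the toy values are all assumptions.

```python
import numpy as np

def masked_covariance(stft_bin, mask):
    """Mask-weighted spatial covariance for one frequency bin.

    stft_bin: complex array of shape (channels, frames)
    mask:     real array of shape (frames,), values in [0, 1]
    """
    weighted = stft_bin * mask  # broadcast the mask over channels
    return (weighted @ stft_bin.conj().T) / max(float(mask.sum()), 1e-8)

def mvdr_weights(phi_s, phi_n, ref=0):
    """MVDR weights w = (phi_n^-1 phi_s) u_ref / trace(phi_n^-1 phi_s)."""
    num = np.linalg.solve(phi_n, phi_s)
    return num[:, ref] / np.trace(num)

# Toy check with a hypothetical 3-microphone steering vector and exact covariances.
d = np.array([1.0 + 0.0j, 0.5 + 0.5j, -0.3 + 0.8j])
phi_s = np.outer(d, d.conj())            # rank-1 speech covariance
phi_n = 0.1 * np.eye(3, dtype=complex)   # spatially white noise (assumed)
w = mvdr_weights(phi_s, phi_n)
response = w.conj() @ d                  # beamformer response to the target direction
```

In this toy setup the response equals the steering-vector entry at the reference microphone, which is the distortionless property of MVDR. In practice the covariances would come from `masked_covariance` applied with a speech mask and a noise mask predicted by a network.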



Yanzhang Geng

An overview of speech endpoint detection algorithms

A Pathological Multi-Vowels Recognition Algorithm Based on LSP Feature

A speech separation algorithm based on the comb-filter effect

A Unified Speech Enhancement System Based on Neural Beamforming With Parabolic Reflector
