Peyman Goli scite author profile

Improving speech intelligibility is an important task for telecommunication systems and hearing aids when speech is degraded by background noise. Although deep neural network (DNN) based learning architectures that use mean square error (MSE) as the cost function have been very successful in speech enhancement, they typically attempt to enhance speech quality by uniformly optimizing the separation of a target speech signal from a noisy observation over all frequency bands. In this work, we propose a new cost function, based on a psychoacoustic model, that focuses more directly on improving speech intelligibility. The band-importance function, a principal component of the speech intelligibility index (SII), is used in the learning algorithm to determine the relative contribution of each frequency band to speech intelligibility. In addition, we augment the network with a signal-to-noise ratio (SNR) estimate to improve the generalization of the method to unseen noisy conditions. The performance of the proposed MSE cost function is compared with that of the conventional MSE cost function under the same conditions. Our approach shows better performance in objective speech intelligibility measures such as coherence SII (CSII) and short-time objective intelligibility (STOI), while mitigating quality scores in the perceptual evaluation of speech quality (PESQ) and speech distortion (SD) measures.
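
As a rough illustration of the idea in the abstract, the sketch below contrasts a uniform MSE with an MSE in which each frequency band's error is scaled by an importance weight. It is a minimal sketch under stated assumptions, not the authors' cost function: the function names are hypothetical, the weight values are made up for the example and are not the SII band-importance values, and the SNR-estimation part of the network is not modeled.

```python
import numpy as np


def uniform_mse(estimate, target):
    """Conventional MSE: every frequency band contributes equally."""
    estimate = np.asarray(estimate, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((estimate - target) ** 2))


def band_weighted_mse(estimate, target, band_importance):
    """MSE in which each band's error is scaled by an importance weight.

    estimate, target: arrays of shape (frames, bands).
    band_importance: non-negative array of shape (bands,). The values are
    an assumption of this sketch; a real system would take them from a
    band-importance function.
    """
    estimate = np.asarray(estimate, dtype=float)
    target = np.asarray(target, dtype=float)
    weights = np.asarray(band_importance, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    squared_error = (estimate - target) ** 2
    # Average the squared error over frames, then weight and sum over bands.
    return float(np.sum(weights * squared_error.mean(axis=0)))


# Two frames, two bands; all of the error sits in band 0.
target = np.zeros((2, 2))
estimate = np.array([[1.0, 0.0], [1.0, 0.0]])
illustrative_weights = [3.0, 1.0]  # hypothetical: band 0 counts three times as much

plain = uniform_mse(estimate, target)
weighted = band_weighted_mse(estimate, target, illustrative_weights)
```

With equal weights the weighted loss reduces to the uniform MSE, so the weighting only changes the result when some bands are treated as more important than others.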


New Method Boosts Speech Intelligibility in Noisy Environments

Goli¹,

Raofy²

2017



Peyman Goli

Advantages of Deep Learning for ECoG-based Speech Recognition

A new perceptually weighted cost function in deep neural network based speech enhancement systems

