<p>ElectrodeNet, a deep-learning-based sound coding strategy for the cochlear implant (CI), is proposed in this study. </p> <p>ElectrodeNet emulates the advanced combination encoder (ACE) strategy by replacing the conventional envelope detection with various artificial neural networks, and the extended ElectrodeNet-CS strategy further incorporates channel selection (CS) into the network. Deep neural network (DNN), convolutional neural network (CNN), and long short-term memory (LSTM) models were trained on fast Fourier transform (FFT) bins and the electrode stimulation patterns produced by ACE processing of clean speech. Objective speech understanding was estimated for ElectrodeNet in CI simulations using short-time objective intelligibility (STOI) and the normalized covariance metric (NCM), with network architecture, dataset language, and noise type as factors. Subjective listening tests with vocoded Mandarin speech were conducted with normal-hearing listeners to measure sentence recognition scores. DNN, CNN, and LSTM based ElectrodeNets agreed closely with ACE in objective and subjective scores, as measured by mean squared error (MSE), linear correlation coefficient (LCC), and Spearman’s rank correlation coefficient (SRCC). The ElectrodeNet-CS strategy produced N-of-M compatible electrode patterns using a modified DNN that embeds maxima selection, and achieved average STOI and sentence recognition scores similar to or even slightly higher than those of ACE. The methods and findings in this study demonstrate the feasibility and potential of using deep learning in CI sound coding strategies.</p>
<p>ElectrodeNet, a deep-learning-based sound coding strategy for the cochlear implant (CI), is proposed to emulate the advanced combination encoder (ACE) strategy by replacing the conventional envelope detection with various artificial neural networks. The extended ElectrodeNet-CS strategy further incorporates channel selection (CS). Deep neural network (DNN), convolutional neural network (CNN), and long short-term memory (LSTM) models were trained on fast Fourier transform (FFT) bins and the channel envelopes obtained from ACE processing of clean speech. Objective speech understanding was estimated for ElectrodeNet in CI simulations using short-time objective intelligibility (STOI) and the normalized covariance metric (NCM). Sentence recognition tests with vocoded Mandarin speech were conducted with normal-hearing listeners. DNN, CNN, and LSTM based ElectrodeNets agreed closely with ACE in objective and subjective scores, as measured by mean squared error (MSE), linear correlation coefficient (LCC), and Spearman’s rank correlation coefficient (SRCC). The ElectrodeNet-CS strategy produced N-of-M compatible electrode patterns using a modified DNN that embeds maxima selection, and achieved average STOI and sentence recognition scores similar to or even slightly higher than those of ACE. The methods and findings demonstrate the feasibility and potential of using deep learning in CI coding strategies.</p>
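<p>To make the N-of-M idea concrete, the following is a minimal sketch of conventional maxima selection on a single frame of channel envelopes: the N largest of M channels are kept and the rest are set to zero. It is not the ElectrodeNet-CS network, which embeds the selection inside a modified DNN; the function name <code>n_of_m_select</code>, the four-channel frame, and the choice of N = 2 are illustrative assumptions, not values from the abstract.</p>

```python
def n_of_m_select(frame, n):
    """Keep the n largest envelope values in one frame and zero the others.

    frame: list of M non-negative channel envelopes (hypothetical values).
    n:     number of maxima (channels) to keep, with 0 < n <= M.
    """
    m = len(frame)
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= number of channels")
    # Rank channel indices by envelope, largest first. sorted() is stable,
    # so tied channels keep their original (lower-index-first) order.
    ranked = sorted(range(m), key=lambda ch: frame[ch], reverse=True)
    keep = set(ranked[:n])
    return [value if ch in keep else 0.0 for ch, value in enumerate(frame)]


if __name__ == "__main__":
    # Illustrative 2-of-4 selection on made-up envelope values.
    print(n_of_m_select([0.1, 0.9, 0.3, 0.7], 2))
```

<p>A pattern is N-of-M compatible in this sense when every frame has at most N non-zero channels, which is the property the abstract attributes to the ElectrodeNet-CS output.</p>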
<div> <div> <div> <div> <p>Objective: ElectrodeNet, a deep-learning-based sound coding strategy for the cochlear implant (CI), is proposed in this study. The speech intelligibility of ElectrodeNet is compared with that of the advanced combination encoder (ACE) coding strategy. Methods: ElectrodeNet emulates the ACE strategy and replaces the conventional envelope detection with various forms of artificial neural networks. Deep neural network (DNN), convolutional neural network (CNN), and long short-term memory (LSTM) models were trained on fast Fourier transformed clean speech and the corresponding electrode stimulation patterns. Objective speech intelligibility was estimated for ElectrodeNets with loss function, network architecture, language, and noise type as factors. Subjective listening tests with vocoded Mandarin speech were conducted with 40 normal-hearing listeners. Results: DNN, CNN, and LSTM based ElectrodeNets exhibited strong correlations with the ACE strategy in short-time objective intelligibility (STOI) and normalized covariance metric (NCM) scores. For objective evaluations, mean squared error (MSE) scores between ACE and ElectrodeNets were below 0.01 under all experimental conditions, whereas the linear correlation coefficient (LCC) and Spearman’s rank correlation coefficient (SRCC) exceeded 0.97 and 0.96, respectively. In the listening tests, strong positive relationships were also observed between ACE and both DNN and CNN based ElectrodeNets, with MSEs smaller than 0.02 and LCCs and SRCCs greater than 0.9. Significance: This study demonstrates the feasibility of using deep learning to encode sound into meaningful patterns for CI listening. </p> </div> </div> </div> </div>
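<p>The three agreement measures named above can be sketched in a few lines. This is a minimal, dependency-free illustration, assuming paired per-condition scores for ACE and an ElectrodeNet; the score lists are made-up placeholders rather than results from the study, and the helper names are hypothetical.</p>

```python
from math import sqrt


def mse(x, y):
    """Mean squared error between two equal-length score lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def pearson(x, y):
    """Linear (Pearson) correlation coefficient, i.e. the LCC."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Placeholder intelligibility scores, one pair per test condition.
    ace_scores = [0.2, 0.4, 0.6, 0.8]
    net_scores = [0.25, 0.35, 0.65, 0.75]
    print(mse(ace_scores, net_scores))
    print(pearson(ace_scores, net_scores))
    print(spearman(ace_scores, net_scores))
```

<p>A low MSE indicates that the two strategies give nearly the same score in each condition, while LCC and SRCC indicate whether the scores rise and fall together linearly and in rank order, respectively.</p>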