Cochlear implants provide users with limited spectral and temporal information. In this study, the amount of spectral and temporal information was systematically varied through simulations of cochlear implant processors using a noise-excited vocoder. Spectral information was controlled by varying the number of channels between 1 and 16, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors from 1 to 512 Hz. Consonants and vowels processed under these conditions were presented to seven normal-hearing, native English-speaking listeners for identification. The results demonstrated that both spectral and temporal cues were important for consonant and vowel recognition, with spectral cues having a greater effect than temporal cues over the ranges of channel numbers and lowpass cutoff frequencies tested. The lowpass cutoff for asymptotic performance in consonant and vowel recognition was 16 and 4 Hz, respectively. The number of channels at which performance plateaued for consonants and vowels was 8 and 12, respectively. Within the above-mentioned ranges of lowpass cutoff frequency and number of channels, the temporal and spectral cues showed a tradeoff for phoneme recognition. Information transfer analyses showed different relative contributions of spectral and temporal cues in the perception of various phonetic/acoustic features.
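The noise-excited vocoder described above can be sketched as follows. This is a minimal illustration, not the study's actual processor: the analysis band range (100 Hz to 6 kHz), filter orders, and log-spaced channel edges are assumptions; only the two manipulated parameters (number of channels and envelope lowpass cutoff) come from the abstract.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def noise_vocoder(signal, fs, n_channels=8, env_cutoff=16.0):
    """Sketch of a noise-excited vocoder: split the input into
    n_channels band-pass channels, extract each channel's envelope by
    half-wave rectification and low-pass filtering at env_cutoff Hz,
    then modulate band-limited noise carriers with the envelopes."""
    # Log-spaced channel edges between 100 Hz and 6 kHz (assumed range)
    edges = np.logspace(np.log2(100), np.log2(6000), n_channels + 1, base=2)
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(signal))
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype='band', fs=fs, output='sos')
        band = sosfilt(band_sos, signal)
        # Envelope: half-wave rectify, then low-pass at env_cutoff Hz
        env_sos = butter(2, env_cutoff, btype='low', fs=fs, output='sos')
        env = sosfilt(env_sos, np.maximum(band, 0.0))
        # Carrier: noise restricted to the same analysis band
        carrier = sosfilt(band_sos, noise)
        out += env * carrier
    return out
```

Lowering `n_channels` degrades spectral detail while lowering `env_cutoff` smooths away temporal envelope detail, which is how the two cues were varied independently in the experiment.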
By conventional spike count measures, auditory neurons in the cat's anterior ectosylvian sulcus cortical area are broadly tuned for the location of a sound source. Nevertheless, an artificial neural network was trained to classify the temporal spike patterns of single neurons according to sound location. The spike patterns of 73 percent of single neurons coded sound location with more than twice the chance level of accuracy, and spike patterns consistently carried more information than spike counts alone. In contrast to neurons that are sharply tuned for location, these neurons appear to encode sound locations throughout 360° of azimuth.
We evaluated two hypothetical codes for sound-source location in the auditory cortex. The topographical code assumed that single neurons are selective for particular locations and that sound-source locations are coded by the cortical location of small populations of maximally activated neurons. The distributed code assumed that the responses of individual neurons can carry information about locations throughout 360 degrees of azimuth and that accurate sound localization derives from information that is distributed across large populations of such panoramic neurons. We recorded from single units in the anterior ectosylvian sulcus area (area AES) and in area A2 of alpha-chloralose-anesthetized cats. Results obtained in the two areas were essentially equivalent. Noise bursts were presented from loudspeakers spaced at 20-degree intervals of azimuth throughout 360 degrees of the horizontal plane. Spike counts of the majority of units were modulated >50% by changes in sound-source azimuth. Nevertheless, sound-source locations that produced greater than half-maximal spike counts often spanned >180 degrees of azimuth. The spatial selectivity of units tended to broaden and, often, to shift in azimuth as sound pressure levels (SPLs) were increased to a moderate level. We sometimes saw systematic changes in spatial tuning along segments of electrode tracks as long as 1.5 mm, but such progressions were not evident at higher sound levels. Moderate-level sounds presented anywhere in the contralateral hemifield produced greater than half-maximal activation of nearly all units. These results are not consistent with the hypothesis of a topographic code. We used an artificial-neural-network algorithm to recognize spike patterns and, thereby, infer the locations of sound sources. Network input consisted of spike density functions formed by averages of responses to eight stimulus repetitions.
Information carried in the responses of single units permitted reasonable estimates of sound-source locations throughout 360 degrees of azimuth. The most accurate units exhibited median errors in localization of <25 degrees, meaning that the network output fell within 25 degrees of the correct location on half of the trials. Spike patterns tended to vary with stimulus SPL, but level-invariant features of patterns permitted estimates of locations of sound sources that varied through 20-dB ranges. Sound localization based on spike patterns that preserved details of spike timing was consistently more accurate than localization based on spike counts alone. These results support the hypothesis that sound-source locations are represented by a distributed code and that individual neurons are, in effect, panoramic localizers.
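The classification scheme described in these two abstracts can be sketched in miniature. Everything below is synthetic and hypothetical: the spike density functions, bin count, and location templates are invented stand-ins, and a plain softmax classifier trained by gradient descent replaces the study's back-propagation network. Only the overall setup comes from the abstracts: inputs are spike density functions averaged over eight stimulus repetitions, targets are 18 azimuths at 20-degree steps, and performance is compared against the chance level of 1/18.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (assumption): each of 18 azimuths (20-degree
# steps spanning 360 degrees) evokes a characteristic temporal spike
# pattern, observed through repetition noise.
n_loc, n_bins, n_rep = 18, 40, 8
templates = rng.random((n_loc, n_bins))

def make_sdf(loc):
    # Spike density function: average of n_rep noisy repetitions,
    # mirroring the network input described in the abstract
    trials = templates[loc] + 0.3 * rng.standard_normal((n_rep, n_bins))
    return trials.mean(axis=0)

X = np.array([make_sdf(l) for l in range(n_loc) for _ in range(20)])
y = np.array([l for l in range(n_loc) for _ in range(20)])
X = X - X.mean(axis=0)  # center features for stable training

# Minimal softmax classifier trained by gradient descent (a sketch;
# the original study's network architecture differs)
W = np.zeros((n_bins, n_loc))
b = np.zeros(n_loc)
onehot = np.eye(n_loc)[y]
for _ in range(300):
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = (p - onehot) / len(X)
    W -= 0.5 * X.T @ grad
    b -= 0.5 * grad.sum(axis=0)

pred = np.argmax(X @ W + b, axis=1)
accuracy = np.mean(pred == y)  # chance level would be 1/18
```

The point of the exercise is the abstracts' criterion: if temporal patterns carry location information, a pattern classifier should beat chance by a wide margin, whereas collapsing each pattern to a single spike count would discard the timing cues.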
Tone languages differ from English in that the pitch pattern of a single-syllable word conveys lexical meaning. In the present study, dependence of tonal-speech perception on features of the stimulation was examined using an acoustic simulation of a CIS-type speech-processing strategy for cochlear prostheses. Contributions of spectral features of the speech signals were assessed by varying the number of filter bands, while contributions of temporal envelope features were assessed by varying the low-pass cutoff frequency used for extracting the amplitude envelopes. Ten normal-hearing native Mandarin Chinese speakers were tested. When the low-pass cutoff frequency was fixed at 512 Hz, consonant, vowel, and sentence recognition improved as a function of the number of channels and reached a plateau at 4 to 6 channels. Subjective judgments of sound quality continued to improve as the number of channels increased to 12, the highest number tested. Tone recognition, i.e., recognition of the four Mandarin tone patterns, depended on both the number of channels and the low-pass cutoff frequency. The trade-off between the temporal and spectral cues for tone recognition indicates that temporal cues can compensate for diminished spectral cues for tone recognition and vice versa. An additional tone recognition experiment using syllables of equal duration showed a marked decrease in performance, indicating that duration cues contribute to tone recognition. A third experiment showed that recognition of processed FM patterns that mimic Mandarin tone patterns was poor when temporal envelope and duration cues were removed.
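The third experiment's FM patterns mimicking the four Mandarin tones can be illustrated with a short sketch. The contour shapes and frequency values below are rough textbook approximations of the four tones (high-level, rising, dipping, falling), not stimuli from the study; the sampling rate and duration are likewise assumptions.

```python
import numpy as np

fs = 16000      # sampling rate in Hz (assumed)
dur = 0.4       # syllable-like duration in seconds (assumed)
t = np.arange(int(fs * dur)) / fs

# Illustrative F0 contours (Hz) for the four Mandarin tones
contours = {
    1: np.full_like(t, 220.0),                 # Tone 1: high level
    2: np.linspace(160.0, 240.0, len(t)),      # Tone 2: rising
    3: 200.0 - 60.0 * np.sin(np.pi * t / dur), # Tone 3: dipping
    4: np.linspace(250.0, 150.0, len(t)),      # Tone 4: falling
}

def fm_pattern(f0):
    # Sinusoid whose instantaneous frequency follows the F0 contour
    phase = 2 * np.pi * np.cumsum(f0) / fs
    return np.sin(phase)

tones = {k: fm_pattern(c) for k, c in contours.items()}
```

Passing such FM patterns through a vocoder strips the frequency modulation down to whatever survives in the channel envelopes, which is why, as the abstract reports, recognition collapsed once envelope and duration cues were also removed.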