Audiovisual (AV) speech perception is well known to confer benefits such as increased speed and accuracy relative to listening alone. Here, we investigated how AV training might benefit or impede auditory perceptual learning of speech degraded by vocoding. In Experiments 1 and 3, participants learned paired associations between vocoded spoken nonsense words and nonsense pictures. In Experiment 1, paired-associates (PA) AV training of one group of participants was compared with audio-only (AO) training of another group. When tested under AO conditions, the AV-trained group was significantly more accurate than the AO-trained group. In addition, pre- and post-training AO forced-choice consonant identification with untrained nonsense words showed that AV-trained participants had learned significantly more than AO-trained participants. The pattern of results pointed to learning at the level of the auditory phonetic features of the vocoded stimuli. Experiment 2, a no-training control with testing and re-testing on AO consonant identification, showed that the controls were as accurate as the AO-trained participants in Experiment 1 but less accurate than the AV-trained participants. In Experiment 3, PA training alternated AV and AO conditions on a list-by-list basis within participants, and training was to criterion (92% correct). PA training with AO stimuli was reliably more effective than training with AV stimuli. We explain these discrepant results in terms of the so-called “reverse hierarchy theory” of perceptual learning and in terms of the diverse multisensory and unisensory processing resources available to speech perception. We propose that early AV speech integration can potentially impede auditory perceptual learning, whereas visual top-down access to relevant auditory features can promote it.
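For readers unfamiliar with vocoding, the sketch below shows one common way speech is degraded in studies of this kind: a noise (channel) vocoder that discards spectral fine structure and keeps only band-wise amplitude envelopes. The abstract does not state the vocoder used here, so the channel count, filter design, and envelope cutoff below are illustrative assumptions, not the study's parameters.

```python
# A minimal noise-vocoder sketch (assumptions: 6 channels, log-spaced
# 4th-order Butterworth bands, ~30 Hz envelope cutoff; the study's actual
# vocoder parameters are not stated in the abstract).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocode(signal, fs, n_channels=6, lo=100.0, hi=5000.0):
    """Discard spectral fine structure: extract each band's amplitude
    envelope and use it to modulate band-limited noise."""
    edges = np.geomspace(lo, hi, n_channels + 1)      # log-spaced band edges
    carrier = np.random.randn(len(signal))            # broadband noise carrier
    env_sos = butter(2, 30.0, btype="lowpass", fs=fs, output="sos")
    out = np.zeros(len(signal))
    for f1, f2 in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [f1, f2], btype="bandpass", fs=fs, output="sos")
        envelope = sosfiltfilt(env_sos, np.abs(sosfiltfilt(band_sos, signal)))
        out += sosfiltfilt(band_sos, carrier) * np.clip(envelope, 0.0, None)
    return out / (np.max(np.abs(out)) + 1e-12)        # peak-normalize
```

With few channels, the output preserves the temporal envelope cues that support learning while removing the fine structure, which is why vocoded speech is intelligible only after perceptual learning.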
Two experiments were conducted to explore the effectiveness of a single vibrotactile stimulator for conveying intonation (question versus statement) and contrastive stress (on one of the first three words of four 4- or 5-word sentences). In Experiment I, artificially deafened normal-hearing subjects judged stress and intonation in counterbalanced visual-alone and visual-tactile conditions. Six voice fundamental frequency-to-tactile transformations were tested. Two sentence types were voiced throughout, and two contained unvoiced consonants. Benefits to speechreading were significant but small. No differences among transformations were observed. In Experiment II, only the tactile stimuli were presented. Significant differences emerged among the transformations, with larger differences for intonation than for stress judgments. Surprisingly, for several transformations, tactile-alone intonation identification was more accurate than visual-tactile identification.
Training with audiovisual (AV) speech has been shown to promote auditory perceptual learning of vocoded acoustic speech by adults with normal hearing. In Experiment 1, we investigated whether AV speech promotes auditory-only (AO) perceptual learning in prelingually deafened adults with late-acquired cochlear implants. Participants were assigned to learn associations between spoken disyllabic CVCVC (C = consonant, V = vowel) nonsense words and nonsense pictures (fribbles), under AV then AO (AV-AO) or counterbalanced AO then AV (AO-AV) training conditions during Periods 1 and 2. After training on each list of paired associates (PA), testing was carried out AO. Across all training, AO PA test scores improved (7.2 percentage points), as did identification of consonants in new untrained CVCVC stimuli (3.5 percentage points). However, there was evidence that AV training impeded immediate AO perceptual learning: during Period 1, training scores across AV and AO conditions were not different, but AO test scores were dramatically lower in the AV-trained participants. During Period 2 AO training, the AV-AO participants obtained significantly higher AO test scores, demonstrating their ability to learn the auditory speech. Across both orders of training, whenever training was AV, AO test scores were significantly lower than training scores. Experiment 2 repeated the procedures with vocoded speech and 43 normal-hearing adults. Following AV training, their AO test scores were as high as or higher than following AO training. Also, their CVCVC identification scores patterned differently from those of the cochlear implant users: in Experiment 1, initial consonants were most accurate, whereas in Experiment 2, medial consonants were most accurate. We suggest that our results are consistent with a multisensory reverse hierarchy theory, which predicts that, whenever possible, perceivers carry out perceptual tasks immediately based on the experience and biases they bring to the task. We point out that while AV training could be an impediment to immediate unisensory perceptual learning in cochlear implant patients, it was also associated with higher scores during training.
The main goal of this study was to investigate the efficacy of four vibrotactile speechreading supplements. Three supplements provided single-channel encodings of fundamental frequency (F0). Two encodings involved scaling and shifting glottal pulses to pulse-rate ranges suited to tactual sensing capabilities; the third transformed F0 into the differential amplitude of two fixed-frequency sinewaves. The fourth supplement added to one of the F0 encodings a second vibrator indicating high-frequency speech energy. A second goal was to develop improved methods for experimental control. Therefore, a sentence corpus was recorded on videodisc using two talkers whose speech was captured by video, microphone, and electroglottograph. Other experimental controls included visual-alone control subjects, a multiple-baseline single-subject design replicated for each of 15 normal-hearing subjects, sentence and syllable pre- and post-tests balanced for difficulty, and a speechreading screening test for subject selection. Across 17 h of treatment and 5 h of visual-alone baseline testing, each subject performed open-set sentence identification. Covariance analyses showed that the single-channel supplements provided a small but significant benefit, whereas the two-channel supplement was not effective. All subjects improved in visual-alone speechreading and maintained individual differences across the experiment. Vibrotactile benefit did not depend on speechreading ability.
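As an illustration of the third encoding, F0 mapped to the differential amplitude of two fixed-frequency sinewaves, here is a minimal sketch. The carrier frequencies, F0 range, and interpolation scheme are assumptions chosen for illustration; the study's actual parameter values are not given in the abstract.

```python
# Hedged sketch: encode an F0 contour as the relative amplitude of two
# fixed-frequency carriers. All numeric parameters below are assumptions.
import numpy as np

def f0_to_differential_amplitude(f0_contour, frame_rate, f_lo=50.0, f_hi=250.0,
                                 f0_min=80.0, f0_max=300.0, fs_out=8000):
    """f0_contour: F0 in Hz per frame (0 = unvoiced), sampled at frame_rate.
    Returns a waveform suitable for driving a vibrotactile stimulator."""
    t_frames = np.arange(len(f0_contour)) / frame_rate
    t = np.arange(int(t_frames[-1] * fs_out)) / fs_out
    f0 = np.interp(t, t_frames, f0_contour)
    # Position of F0 within its assumed range -> mixing weight in [0, 1]:
    # low F0 drives mostly the low carrier, high F0 mostly the high carrier.
    w = np.clip((f0 - f0_min) / (f0_max - f0_min), 0.0, 1.0)
    sig = (1.0 - w) * np.sin(2 * np.pi * f_lo * t) + w * np.sin(2 * np.pi * f_hi * t)
    return sig * (f0 > 0)  # silence the stimulator during unvoiced segments
```

The 250 Hz upper carrier is chosen here because vibrotactile sensitivity peaks near that frequency; the amplitude trade-off lets a single skin site convey F0 height without requiring the skin to resolve frequency directly.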
A solution to the following problem is presented: obtain a principled approach to studying error patterns in sentence-length responses from subjects who were instructed simply to report what a talker had said. The solution is a sequence comparator that performs phoneme-to-phoneme alignment on transcribed stimulus and response sentences. Data for developing and testing the sequence comparator were obtained from 139 normal-hearing subjects who lipread (speechread) 100 sentences and from 15 different subjects who identified nonsense syllables by lipreading. Development of the sequence comparator involved testing two different cost metrics (visemes versus Euclidean distances) and two related comparison algorithms. After alignments with face validity were achieved, a validation experiment was conducted in which measures from random versus true stimulus-response sentence pairs were compared. Measures of phonemes correct and substitution uncertainty were found to be sensitive to the nature of the sentence pairs. In particular, correct phoneme matches were extremely rare in random pairings compared with true pairs. Also, an information-theoretic measure of uncertainty for substitutions showed that uncertainty was always higher for random than for true pairs.
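The comparator itself is not reproduced in the abstract, but its core, dynamic-programming alignment of phoneme strings under a substitution cost metric, follows the standard Levenshtein/Needleman-Wunsch pattern sketched below. The unit costs are placeholders; the paper's viseme-class and Euclidean-distance metrics would be supplied via the `sub_cost` parameter, which is a hypothetical name introduced here.

```python
# A minimal phoneme-to-phoneme alignment sketch (standard edit-distance
# dynamic programming with traceback). Unit costs are placeholders for the
# paper's viseme-based and Euclidean-distance metrics.
def align(stimulus, response, sub_cost=lambda a, b: 0 if a == b else 1,
          indel_cost=1):
    """Align two phoneme sequences; return (total cost, aligned pairs),
    where None marks an insertion or deletion."""
    n, m = len(stimulus), len(response)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * indel_cost
    for j in range(1, m + 1):
        D[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i-1][j-1] + sub_cost(stimulus[i-1], response[j-1]),
                          D[i-1][j] + indel_cost,   # deletion from stimulus
                          D[i][j-1] + indel_cost)   # insertion in response
    # Trace back through the cost table to recover aligned phoneme pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                D[i][j] == D[i-1][j-1] + sub_cost(stimulus[i-1], response[j-1])):
            pairs.append((stimulus[i-1], response[j-1])); i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i-1][j] + indel_cost:
            pairs.append((stimulus[i-1], None)); i -= 1
        else:
            pairs.append((None, response[j-1])); j -= 1
    return D[n][m], pairs[::-1]
```

For example, `align(list("sit"), list("sip"))` returns cost 1 and the pairs [('s','s'), ('i','i'), ('t','p')]; tallying such pairs over a corpus yields the phonemes-correct measure and the substitution confusion counts from which the information-theoretic uncertainty measure can be computed.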