Four experiments investigated the acoustical correlates of similarity and categorization judgments of environmental sounds. In Experiment 1, similarity ratings were obtained from pairwise comparisons of recordings of 50 environmental sounds. A three-dimensional multidimensional scaling (MDS) solution showed three distinct clusterings of the sounds, comprising harmonic sounds, discrete impact sounds, and continuous sounds. Furthermore, sounds from similar sources tended to lie in close proximity to each other in the MDS space. The orderings of the sounds on the individual dimensions of the solution were well predicted by linear combinations of acoustic variables, such as harmonicity, amount of silence, and modulation depth. The orderings of sounds also correlated significantly with MDS solutions for similarity ratings of imagined sounds and of imagined sources of sounds (Experiments 2 and 3), and with free categorization of the 50 sounds (Experiment 4), although the categorization data were less well predicted by acoustic features than were the similarity data.
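The MDS technique named in the abstract can be illustrated with a minimal classical (Torgerson) MDS sketch. The dissimilarity matrix below is a toy stand-in, not the study's 50-sound rating data; in the actual study the input would be the averaged pairwise similarity ratings converted to dissimilarities.

```python
import numpy as np

def classical_mds(dissim, n_dims=3):
    """Classical (Torgerson) MDS: embed a symmetric dissimilarity
    matrix into n_dims dimensions by double-centering the squared
    dissimilarities and eigendecomposing the resulting Gram matrix."""
    n = dissim.shape[0]
    d2 = dissim ** 2
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ d2 @ J                 # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_dims]
    # Scale eigenvectors by the square roots of the (clipped) eigenvalues.
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

# Toy example: 4 "sounds" whose pairwise dissimilarities place them
# at the corners of a unit square.
pts = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
dissim = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
coords = classical_mds(dissim, n_dims=2)

# The embedding reproduces the original distances (up to rotation).
recovered = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
assert np.allclose(recovered, dissim, atol=1e-6)
```

Published MDS solutions are often computed with iterative non-metric algorithms rather than this closed-form variant, but the geometric idea, recovering a low-dimensional configuration whose inter-point distances mirror the judged dissimilarities, is the same.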
Three experiments tested listeners' ability to identify 70 diverse environmental sounds using limited spectral information. Experiment 1 employed low- and high-pass filtered sounds with filter cutoffs ranging from 300 to 8000 Hz. Listeners were quite good (>50% correct) at identifying the sounds even when severely filtered; for the high-pass filters, performance was never below 70%. Experiment 2 used octave-wide bandpass filtered sounds with center frequencies from 212 to 6788 Hz and found that performance with the higher bandpass filters was 70%-80% correct, whereas with the lower filters listeners achieved 30%-50% correct. To examine the contribution of temporal factors, in Experiment 3 vocoder methods were used to create event-modulated noises (EMN), which had extremely limited spectral information. About half of the 70 EMN were identifiable on the basis of their temporal patterning. Multiple regression analysis suggested that envelope shape, periodicity, and the consistency of temporal changes across frequency channels are among the acoustic features listeners may use to identify EMN. Identification performance with high- and low-pass filtered environmental sounds varied in a manner similar to that of speech sounds, except that there seemed to be somewhat more information in the higher frequencies for the environmental sounds used in this experiment.
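The spectral restriction described above can be sketched with a crude FFT brick-wall bandpass filter. This is only an illustration of the idea, not the study's filtering method (listening experiments typically use steep analog-style filters), and the test signal is a synthetic two-tone mixture rather than a recorded environmental sound.

```python
import numpy as np

def fft_bandpass(signal, fs, f_lo, f_hi):
    """Brick-wall bandpass: zero every FFT bin outside [f_lo, f_hi] Hz,
    then invert back to the time domain."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    spec[(freqs < f_lo) | (freqs > f_hi)] = 0
    return np.fft.irfft(spec, n=len(signal))

fs = 16000
t = np.arange(fs) / fs
# Two-component test tone: 500 Hz (below the band) + 3000 Hz (inside it).
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 3000 * t)
# An octave-wide band (2400-4800 Hz, geometric center ~3394 Hz).
y = fft_bandpass(x, fs, f_lo=2400, f_hi=4800)

# The 500 Hz component is removed; the 3000 Hz component survives.
spec = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), 1 / fs)
assert spec[np.argmin(np.abs(freqs - 500))] < 1e-6 * spec[np.argmin(np.abs(freqs - 3000))]
```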
Performance on 19 auditory discrimination and identification tasks was measured for 340 listeners with normal hearing. Test stimuli included single tones, sequences of tones, amplitude-modulated and rippled noise, temporal gaps, speech, and environmental sounds. Principal components analysis and structural equation modeling of the data support the existence of a general auditory ability and four specific auditory abilities. The specific abilities are (1) loudness and duration (overall energy) discrimination; (2) sensitivity to temporal envelope variation; (3) identification of highly familiar sounds (speech and nonspeech); and (4) discrimination of unfamiliar simple and complex spectral and temporal patterns. Examination of Scholastic Aptitude Test (SAT) scores for a large subset of the population revealed little or no association between general or specific auditory abilities and general intellectual ability. The findings provide a basis for research to further specify the nature of the auditory abilities. Of particular interest are results suggestive of a familiar sound recognition (FSR) ability, apparently specialized for sound recognition on the basis of limited or distorted information. This FSR ability is independent of normal variation in both spectral-temporal acuity and general intellectual ability.
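The principal-components logic behind extracting a general ability factor can be sketched on synthetic data. Everything here is hypothetical: the score matrix is simulated so that four tasks share one common factor, mimicking the structure the abstract describes but not the study's actual data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_listeners = 340  # matching the sample size; scores are synthetic

# Hypothetical scores on 4 tasks: a shared "general ability" component
# plus independent task-specific noise.
general = rng.normal(size=(n_listeners, 1))
scores = general @ np.ones((1, 4)) + 0.5 * rng.normal(size=(n_listeners, 4))

# PCA via SVD of the column-centered score matrix.
centered = scores - scores.mean(axis=0)
_, svals, _ = np.linalg.svd(centered, full_matrices=False)
var_explained = svals ** 2 / np.sum(svals ** 2)

# A dominant first component is the signature of a general factor.
assert var_explained[0] > 0.5
```

With these simulated loadings the first component accounts for most of the variance; in real batteries the general factor is weaker, and the residual structure (the specific abilities) is what structural equation modeling then formalizes.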
This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures), as well as the Speech, Spatial, and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance.
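The final regression step, predicting a global speech-understanding factor from a handful of reduced predictors, can be sketched with ordinary least squares. The predictors and effect sizes below are invented for illustration; only the sample size echoes the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 98  # matching the study's sample size; all data are synthetic

# Hypothetical standardized predictors, e.g. a cognitive factor,
# hearing loss, and age.
X = rng.normal(size=(n, 3))
beta_true = np.array([0.6, -0.4, -0.2])
y = X @ beta_true + 0.5 * rng.normal(size=n)  # speech-understanding factor

# Ordinary least squares with an intercept column.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta
r2 = 1 - resid.var() / y.var()

# With these (invented) effect sizes the model accounts for a majority
# of the variance, analogous in spirit to the ~60% reported.
assert 0.4 < r2 < 1.0
```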
Melodic and rhythmic context were systematically varied in a pattern recognition task involving pairs (standard-comparison) of nine-tone auditory sequences. The experiment was designed to test the hypothesis that rhythmic context can direct attention toward or away from tones which instantiate higher order melodic rules. Three levels of melodic structure (one, two, or no higher order rules) were crossed with four levels of rhythm [isochronous, dactyl (A U U), anapest (U U A), irregular]. Rhythms were designed to shift accent locations on three centrally embedded tones. Listeners were more accurate in detecting violations of higher order melodic rules when the rhythmic context induced accents on tones which instantiated these rules. Effects are discussed in terms of attentional rhythmicity.