Spectral dynamics of sibilant fricatives are contrastive and language specific

Reidy, Patrick

doi:10.1121/1.4964510

Cited by 29 publications

(18 citation statements)

References 48 publications


“…Some constraint must be in place to ensure that listeners match like with like when adapting to certain phonetic or auditory dimensions. In contrast, if the proper cue is not COG, but rather other aspects of the spectral shape including dynamic properties of the spectrum (e.g., Reidy, 2015, 2016), it is certainly the case that the class of sibilants have more spectral similarities to one another than to the larger class of fricatives (specifically, sibilants have a strong spectral peak and an overall high amplitude). Moreover, the aspects of the spectrum that are perceptually extracted must extend beyond phonetic-specific cues given that listeners adapted in a highly comparable manner to both linguistic and nonlinguistic exposure stimuli.…”

Section: Framing Of the Present Experiments (mentioning)

confidence: 99%

Acoustic–phonetic and auditory mechanisms of adaptation in the perception of sibilant fricatives

Chodroff

Wilson

2019

Atten Percept Psychophys


Listeners are highly proficient at adapting to contextual variation when perceiving speech. In the present study, we examined the effects of brief speech and nonspeech contexts on the perception of sibilant fricatives. We explored three theoretically motivated accounts of contextual adaptation, based on phonetic cue calibration, phonetic covariation, and auditory contrast. Under the cue calibration account, listeners adapt by estimating a talker-specific average for each phonetic cue or dimension; under the cue covariation account, listeners adapt by exploiting consistencies in how the realization of speech sounds varies across talkers; under the auditory contrast account, adaptation results from (partial) masking of spectral components that are shared by adjacent stimuli. The spectral center of gravity, a phonetic cue to fricative identity, was manipulated for several types of context sound: /z/-initial syllables, /v/-initial syllables, and white noise matched in long-term average spectrum (LTAS) to the /z/-initial stimuli. Listeners' perception of the /s/-/ʃ/ contrast was significantly influenced by /z/-initial syllables and LTAS-matched white noise stimuli, but not by /v/-initial syllables. No significant difference in adaptation was observed between exposure to /z/-initial syllables and matched white noise stimuli, and speech did not have a considerable advantage over noise when the two were presented consecutively within a context. The pattern of findings is most consistent with the auditory contrast account of short-term perceptual adaptation. The cue covariation account makes accurate predictions for speech contexts, but not for nonspeech contexts or for the absence of a speech-versus-nonspeech difference.
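The abstract above names the spectral center of gravity (COG) as the manipulated cue without defining it. As a rough illustration only, and not the procedure used by Chodroff and Wilson, COG can be sketched as the power-weighted mean frequency of a single analysis frame. The function names, the naive DFT, the 64-sample frame, and the 4 kHz test tone are all assumptions made for this example.

```python
import cmath
import math


def power_spectrum(samples):
    """One-sided power spectrum via a naive DFT (adequate for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        # Interior bins also stand in for their negative-frequency mirror,
        # so they count twice; DC and (for even n) Nyquist count once.
        is_edge = k == 0 or (n % 2 == 0 and k == n // 2)
        spectrum.append(abs(acc) ** 2 * (1 if is_edge else 2))
    return spectrum


def center_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency (Hz) of one frame."""
    power = power_spectrum(samples)
    n = len(samples)
    freqs = [k * sample_rate / n for k in range(len(power))]
    total = sum(power)
    if total == 0:
        raise ValueError("frame has no energy")
    return sum(f * p for f, p in zip(freqs, power)) / total


# Hypothetical frame: a pure 4 kHz sinusoid sampled at 16 kHz.
# 4 kHz falls exactly on DFT bin 16 of a 64-sample frame.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 4000 * t / sample_rate) for t in range(64)]
cog_hz = center_of_gravity(frame, sample_rate)
```

For a real fricative the energy is spread over many bins, so COG summarizes where the bulk of the noise spectrum sits; shifting it up or down is one way to move a stimulus along an /s/-/ʃ/ continuum.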


“…This may deviate the temporal variation of the spectral properties (spectral dynamics) of fricative when it is produced with NAE. A state-of-the-art method to analyze the spectral dynamics of the unvoiced sibilant fricative is explored [12,17].…”

Section: Temporal Variation Of Spectral Characteristics (mentioning)

confidence: 99%

“…Therefore, the frame-rate varies depending on the speech sample. Then, the DFT magnitude spectrum computed from each segment is passed through a fourth-order gammatone auditory filterbank of 361 filters [17]. Then, the spectral energy at the output of each filter is summed up (termed as "auditory excitation"), and this auditory excitation represents the spectral envelope for a speech frame of a sound unit.…”

Section: Temporal Variation Of Spectral Characteristics (mentioning)

confidence: 99%

Nasal Air Emission in Sibilant Fricatives of Cleft Lip and Palate Speech

et al. 2019


Cleft lip and palate (CLP) is a congenital disorder of the orofacial region. Nasal air emission (NAE) in CLP speech occurs due to the presence of velopharyngeal dysfunction (VPD), and it mostly occurs in the production of fricative sounds. The objective of present work is to study the acoustic characteristics of voiceless sibilant fricatives in Kannada distorted by NAE and develop an SVM-based classification to distinguish normal fricatives from the NAE distorted fricatives. Static spectral measures, such as spectral moments are used to analyze the deviant spectral distribution of NAE distorted fricatives. As the aerodynamic parameters are deviated due to VPD, the temporal variation of spectral characteristics might also get deviated in NAE distorted fricatives. This variation is studied using the peak equivalent rectangular bandwidth (ERBN)-number, a psychoacoustic measure to analyze the temporal variation in the spectral properties of fricatives. The analysis of NAE distorted fricatives shows that the maximum spectral density is concentrated in the lower frequency range with steep positive skewness and more variations in the trajectories of peak ERBN-number as compared to the normal fricatives. The proposed SVM-based classification achieves good detection rates in discriminating NAE distorted fricatives from normal fricatives.
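The citation statements and abstract above describe passing a frame's spectrum through a fourth-order gammatone filterbank of 361 filters, summing the energy at each filter output ("auditory excitation"), and tracking the peak ERBN-number. The following is a minimal sketch of that idea under several assumptions that the text does not state: the Glasberg and Moore (1990) ERB-number formula, a common closed-form approximation of the gammatone power response, channels spaced 0.1 apart from ERB-number 3 to 39 (chosen only because it yields 361 channels), and a made-up single-line spectrum. All function names are hypothetical.

```python
import math


def hz_to_erb_number(freq_hz):
    """ERB-number scale of Glasberg & Moore (1990)."""
    return 21.4 * math.log10(0.00437 * freq_hz + 1.0)


def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) / 0.00437


def erb_bandwidth_hz(center_hz):
    """Equivalent rectangular bandwidth (Hz) at a center frequency."""
    return 24.7 * (0.00437 * center_hz + 1.0)


def gammatone_power_gain(freq_hz, center_hz, order=4):
    """Approximate power response of a gammatone filter of the given order."""
    b = 1.019 * erb_bandwidth_hz(center_hz)
    return (1.0 + ((freq_hz - center_hz) / b) ** 2) ** (-order)


def auditory_excitation(freqs_hz, power, erb_numbers):
    """Sum the filter-weighted spectral power in each channel."""
    excitation = []
    for e in erb_numbers:
        fc = erb_number_to_hz(e)
        excitation.append(sum(p * gammatone_power_gain(f, fc)
                              for f, p in zip(freqs_hz, power)))
    return excitation


def peak_erb_number(freqs_hz, power, erb_numbers):
    """ERB-number of the channel with the largest excitation."""
    excitation = auditory_excitation(freqs_hz, power, erb_numbers)
    best = max(range(len(excitation)), key=excitation.__getitem__)
    return erb_numbers[best]


# Assumed channel layout: 361 channels, 0.1 ERB apart, from 3 to 39.
erb_numbers = [3.0 + 0.1 * i for i in range(361)]

# Hypothetical power spectrum of one frame: all energy at 6 kHz.
freqs_hz = [125.0 * k for k in range(65)]  # 0 to 8000 Hz
power = [1.0 if f == 6000.0 else 0.0 for f in freqs_hz]
peak = peak_erb_number(freqs_hz, power, erb_numbers)
```

Repeating `peak_erb_number` over successive frames of a fricative gives a trajectory of the kind the abstract refers to; how frames are windowed and spaced is left open here because the excerpt only says the frame rate varies by sample.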


“…While sibilant fricatives can be modeled synthetically as being produced by static articulatory postures, recent research indicates that the spectral properties of sibilant fricatives do vary over time (Iskarous et al., 2011; Yu, 2016) and that the spectral-kinematic properties of sibilant fricatives carry language- and consonant-specific acoustic information (Reidy, 2016). Preliminary analyses of cross-sectional age differences in how the /s/ vs. /ʃ/ contrast develops in preschool-aged children acquiring American English indicate development along both static and time-varying spectral properties (Reidy, 2015).…”

Section: Acoustic Measures Of Preschool Children's Speech (mentioning)

confidence: 99%

“…For the databases used in the two studies by Reidy (i.e., Reidy, 2015, 2016), these multiple tokens were elicited using the picture-prompted word-repetition task described in Edwards and Beckman (2008b). This task is an efficient way to elicit a reasonably large sample of productions of a number of target sounds.…”

Section: Summary and The Road Ahead (mentioning)

confidence: 99%

Methods for eliciting, annotating, and analyzing databases for child speech development

Beckman

Plummer

Munson

et al. 2017

Computer Speech & Language

Self Cite


Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem then is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability. Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages. 
Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of age-appropriate computational models.
