Acoustic measurements and perceptual evaluation of hoarseness in children's voices

McAllister, Anita; Sundberg, Johan; Hibi, Seishi

doi:10.1080/140154398434310-1

Cited by 14 publications

(2 citation statements)

References 27 publications

“…Objective metrics obtained using various acoustic instruments have been investigated, and attempts have been made to correlate these with perceptual voice quality assessments [8][9][10][11][12]. A plethora of temporal, spectral, and cepstral metrics have been proposed to evaluate voice quality [13,14]. Commonly used features or vocal metrics include fundamental frequency (f0), loudness, jitter, shimmer, vocal formants, harmonic-to-noise ratio (HNR), spectral tilt (H1-H2, harmonic richness factor), maximum flow declination rate (MFDR), duty ratio, cepstral peak prominence (CPP), Mel-frequency cepstral coefficients (MFCCs), power spectrum ratio, and others [15][16][17][18][19]. Self-reported feelings of decreased vocal functionality have been used as a criterion for vocal fatigue in many previous studies [1,4,20,21,22].…”

mentioning

confidence: 99%

Investigation of Vocal Fatigue Using a Dose-Based Vocal Loading Task

et al. 2020

Vocal loading tasks are often used to investigate the relationship between voice use and vocal fatigue in laboratory settings. The present study investigated the concept of a novel quantitative dose-based vocal loading task for vocal fatigue evaluation. Ten female subjects participated in the study. Voice use was monitored and quantified using an online vocal distance dose calculator during six consecutive 30-min long sessions. Voice quality was evaluated subjectively using the CAPE-V and SAVRa before, between, and after each vocal loading task session. Fatigue-indicative symptoms, such as cough, swallowing, and voice clearance, were recorded. Statistical analysis of the results showed that the overall severity, the roughness, and the strain ratings obtained from CAPE-V obeyed similar trends as the three ratings from the SAVRa. These metrics increased over the first two thirds of the sessions to reach a maximum, and then decreased slightly near the session end. Quantitative metrics obtained from surface neck accelerometer signals were found to obey similar trends. The results consistently showed that an initial adjustment of voice quality was followed by vocal saturation, supporting the effectiveness of the proposed loading task.

These tools require specific vocal stimuli. For example, the CAPE-V requires the completion of three defined phonation tasks assessed through perceptual rating. This therefore limits the applicability of these tools in situations where the vocal stimuli are varied or unspecified. Many studies have investigated uncertainties in subjective judgment methodologies for voice quality evaluation. Kreiman and Gerratt investigated the source of listener disagreement in voice quality assessment using unidimensional rating scales, and found that no single metric from natural voice recordings allowed the evaluation of voice quality [6].
Kreiman also found that individual standards of voice quality, scale resolution, and voice attribute magnitude significantly influenced intra-rater agreement [7]. Objective metrics obtained using various acoustic instruments have been investigated, and attempts have been made to correlate these with perceptual voice quality assessments [8][9][10][11][12]. A plethora of temporal, spectral, and cepstral metrics have been proposed to evaluate voice quality [13,14]. Commonly used features or vocal metrics include fundamental frequency (f0), loudness, jitter, shimmer, vocal formants, harmonic-to-noise ratio (HNR), spectral tilt (H1-H2, harmonic richness factor), maximum flow declination rate (MFDR), duty ratio, cepstral peak prominence (CPP), Mel-frequency cepstral coefficients (MFCCs), power spectrum ratio, and others [15][16][17][18][19]. Self-reported feelings of decreased vocal functionality have been used as a criterion for vocal fatigue in many previous studies [1,4,20,21,22]. Standard self-administered questionnaires, such as the SAVRa and the Vocal Fatigue Index (VFI), have been used to identify individuals with vocal fatigue, and to characterize their sy...

“…Aperiodicity in the speech signal, reflected in the features of jitter and shimmer, has been linked to perceptions of breathiness, hoarseness, and roughness (McAllister, Sundberg, & Hibi, 1998). ASD severity can be indexed using these measures, with high values and high variability of jitter being associated with more severe ASD (Bone, Lee, Black, et al, 2014).…”

Section: Speech Prosody in ASD

mentioning

confidence: 99%

Engineering Innovation in Speech Science: Data and Technologies

et al. 2019

Purpose: As increasing amounts and types of speech data become accessible, health care and technology industries increasingly demand quantitative insight into speech content. The potential for speech data to provide insight into cognitive, affective, and psychological health states and behavior crucially depends on the ability to integrate speech data into the scientific process. Current engineering methods for acquiring, analyzing, and modeling speech data present the opportunity to integrate speech data into the scientific process. Additionally, machine learning systems recognize patterns in data that can facilitate hypothesis generation, data analysis, and statistical modeling. The goals of the present article are (a) to review developments across these domains that have allowed real-time magnetic resonance imaging to shed light on aspects of atypical speech articulation; (b) in a parallel vein, to discuss how advancements in signal processing have allowed for an improved understanding of communication markers associated with autism spectrum disorder; and (c) to highlight the clinical significance and implications of the application of these technological advancements to each of these areas.

Conclusion: The collaboration of engineers, speech scientists, and clinicians has resulted in (a) the development of biologically inspired technology that has been proven useful for both small- and large-scale analyses, (b) a deepened practical and theoretical understanding of both typical and impaired speech production, and (c) the establishment and enhancement of diagnostic and therapeutic tools, all having far-reaching, interdisciplinary significance.

Supplemental Material: https://doi.org/10.23641/asha.7740191
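
The citation statements above refer to jitter and shimmer as measures of aperiodicity. As a rough illustration only, and not the method used in any of the cited studies, the sketch below computes the commonly used "local" form of both measures: the mean absolute difference between consecutive cycles, divided by the mean value. The function name `local_perturbation` and the sample period and amplitude values are hypothetical, and the sketch assumes the glottal cycles have already been extracted from the signal.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value.

    Applied to cycle periods this gives local jitter; applied to cycle
    peak amplitudes it gives local shimmer (both as fractions, not percent).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical cycle data, for illustration only.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # duration of each glottal cycle
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(amplitudes)
print(f"local jitter:  {jitter:.4f}")
print(f"local shimmer: {shimmer:.4f}")
```

A perfectly periodic signal yields zero for both measures; larger values indicate more cycle-to-cycle irregularity.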