Multiple factors influence the voice quality measurements (VQM) obtained during an acoustic voice assessment, including gender, intrasubject variability, microphone, environmental noise (type and level), data acquisition (DA) system, and analysis software. This study used regression trees to investigate the order and relative importance of these factors for VQM, including interaction effects among the factors, and how the outcome differs when the acoustic environment is controlled for noise. Twenty normophonic participants provided 20 voice samples each, which were recorded synchronously on five DA systems combined with six different microphones. The samples were mixed with five noise types at eight signal-to-noise ratio (SNR) levels. The resulting 80,000 audio samples were analyzed for fundamental frequency (F₀), jitter, and shimmer using three software analysis systems: MDVP, PRAAT, and TF32 (CSpeech). Fifteen regression trees and their Variable Importance Measures were used to analyze the data. The analyses confirmed that all of the factors listed above were influential. The results suggest that gender, intrasubject variability, and microphone were significant influences on F₀. Software system and gender were highly influential on measurements of jitter and shimmer. Environmental noise was the prominent factor affecting VQM when SNR levels were below 30 dB.
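The abstract does not say how the noise was mixed into the recordings. As an illustration only, one common way to reach a target SNR is to scale the noise so that the ratio of signal RMS to noise RMS matches the requested level in dB, then add the two waveforms. The function names `rms` and `mix_at_snr` below are hypothetical and are not taken from the study; this is a minimal sketch under the assumption that SNR is defined on RMS amplitude over the whole sample.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` to the requested SNR (in dB) relative to `signal` and mix.

    Assumption: SNR = 20 * log10(rms(signal) / rms(noise)), computed over the
    whole sample. Returns (mixed, scaled_noise).
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise is silent; cannot scale it to a target SNR")
    # Noise RMS that yields the requested SNR for this signal.
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled_noise)]
    return mixed, scaled_noise


# Illustrative data: one second of a 200 Hz tone at 8 kHz plus Gaussian noise.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
white = [rng.gauss(0.0, 1.0) for _ in range(8000)]
mixed, scaled = mix_at_snr(tone, white, snr_db=30.0)
```

Repeating such a call for each noise type and SNR level would produce a fully crossed set of degraded samples, which is consistent with the factorial design the abstract describes.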
It is universally recognized that sampling rate (Fₛ) influences the reliability and validity of acoustic voice measurements; however, an exact relationship has not been determined. The purpose of this experiment was to investigate the influence of Fₛ on acoustic voice quality measurements, while considering the influences of gender, intra-subject variability, microphone, environmental noise, data acquisition hardware, and analysis software as balancing factors. The impact of Fₛ, from 44.1 kHz to 10 kHz, was explored by analyzing 864,000 measures of fundamental frequency, jitter, and shimmer, using three software analysis systems: MDVP, TF32, and PRAAT. Results suggest that the recommended, acceptable, and critical Fₛ for acoustic voice analysis are above 26 kHz, above 19 kHz, and 12 kHz, respectively. Thus, voice samples captured above 26 kHz can be used for data analysis and compared without introducing error due to Fₛ.
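The three thresholds reported above can be turned into a simple screening check for a recording's sampling rate. The sketch below is only an illustration: the function name `classify_sampling_rate`, the label strings, and the handling of the band between 12 kHz and 19 kHz (labelled "marginal" here) are assumptions, since the abstract gives the thresholds but not how values on or between them should be treated.

```python
# Thresholds in Hz, taken from the abstract.
RECOMMENDED_HZ = 26_000  # "recommended": above 26 kHz
ACCEPTABLE_HZ = 19_000   # "acceptable": above 19 kHz
CRITICAL_HZ = 12_000     # "critical": 12 kHz


def classify_sampling_rate(fs_hz):
    """Label a sampling rate against the reported thresholds.

    Assumptions (not stated in the abstract): "above" is a strict comparison,
    and rates from the critical value up to the acceptable threshold are
    labelled "marginal".
    """
    if fs_hz <= 0:
        raise ValueError("sampling rate must be positive")
    if fs_hz > RECOMMENDED_HZ:
        return "recommended"
    if fs_hz > ACCEPTABLE_HZ:
        return "acceptable"
    if fs_hz >= CRITICAL_HZ:
        return "marginal"
    return "below critical"


for fs in (44_100, 22_050, 16_000, 10_000):
    print(fs, classify_sampling_rate(fs))
```

A check like this could be run before pooling recordings from different sources, so that samples captured at or below 26 kHz are flagged rather than compared directly.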