A forensic-phonetic speaker identification experiment is described which tests to what extent same-speaker pairs from a 50-speaker Japanese database can be discriminated from different-speaker pairs using a Bayesian likelihood ratio (LR) as discriminant function. Non-contemporaneous telephone recordings are used, with comparison based on mean values from three segments only: a nasal, a voiceless fricative, and a vowel. It is shown that discrimination using the LR-based distance is better than with a conventional distance, and that the cepstrum outperforms the formants. An LR for the test of 50 is obtained for formant-based discrimination, compared to c. 900 for the cepstrum, and the tests are thus shown to be capable of yielding a probative strength of support for the prosecution hypothesis that is conventionally quantified as 'moderate' for formants but 'moderately strong' for the cepstrum. Comparisons are made with results from similar experiments.
KEYWORDS: forensic speaker identification, strength of evidence, formants, cepstrum, Bayes' theorem, likelihood ratio
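The "LR for the test" reported above can be illustrated with a minimal sketch. It assumes one common reading of the term: the rate at which same-speaker pairs exceed the LR threshold divided by the rate at which different-speaker pairs do. The function names, the threshold of 1, and the verbal-scale cut-offs are assumptions for illustration only. The cut-offs were chosen to agree with the two values quoted in the abstract (50 as 'moderate', 900 as 'moderately strong') and are not taken from the paper.

```python
import math


def lr_for_test(same_pair_lrs, diff_pair_lrs, threshold=1.0):
    """Hit rate divided by false-alarm rate for an LR-threshold test.

    same_pair_lrs: LRs computed for known same-speaker pairs.
    diff_pair_lrs: LRs computed for known different-speaker pairs.
    threshold:     assumed decision threshold (LR > 1 favours "same speaker").
    """
    hit_rate = sum(lr > threshold for lr in same_pair_lrs) / len(same_pair_lrs)
    fa_rate = sum(lr > threshold for lr in diff_pair_lrs) / len(diff_pair_lrs)
    if fa_rate == 0:
        # No false alarms: the ratio is unbounded (or undefined if no hits).
        return math.inf if hit_rate > 0 else math.nan
    return hit_rate / fa_rate


def verbal_label(lr):
    """Map an LR >= 1 to a verbal strength label (assumed cut-offs)."""
    if lr < 1:
        raise ValueError("an LR below 1 supports the alternative hypothesis")
    scale = [
        (10, "limited"),
        (100, "moderate"),
        (1000, "moderately strong"),
        (10000, "strong"),
    ]
    for upper, label in scale:
        if lr < upper:
            return label
    return "very strong"


# Toy data: 3 of 4 same-speaker pairs and 1 of 4 different-speaker pairs
# exceed the threshold, so the ratio is 0.75 / 0.25.
print(lr_for_test([2, 3, 0.5, 4], [0.1, 0.2, 5, 0.3]))
print(verbal_label(50), "|", verbal_label(900))
```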
This study was designed to evaluate whether previously proposed acoustic measures of vowel nasality are applicable to speaker comparison in a forensic context. Three acoustic parameters were selected and analysed for vowels in nasal and oral phonetic environments: the amplitude difference (in dB) between the first formant and the extra peak caused by nasalisation (A1-P1), and the frequencies (in Hz) of the first formant (F1) and the extra peak (Fp1). We analysed eighteen monosyllables and six isolated words uttered by fifty male speakers and recorded through a microphone. Each speaker was recorded twice, at an interval of two to five months. Between-speaker variation was examined using the F-ratio, and within-speaker variation by regression analysis between the two recording sessions. Results revealed that Fp1 of front vowels yielded large F-ratio values, which indicates high speaker-discriminating power, and that A1-P1 of the vowels in oral contexts showed within-speaker stability over time.
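As an illustration of the F-ratio mentioned above, the following sketch uses one common definition: the variance of the speaker means divided by the average within-speaker variance. The abstract does not give the exact formula used in the study, so the definition, the function name `f_ratio`, and the toy values are all assumptions.

```python
from statistics import mean, variance


def f_ratio(tokens_by_speaker):
    """Between-speaker variance over mean within-speaker variance.

    tokens_by_speaker: one list of measurements (e.g. Fp1 in Hz) per speaker,
    each with at least two tokens. Sample variances (n - 1) are used.
    """
    speaker_means = [mean(tokens) for tokens in tokens_by_speaker]
    between = variance(speaker_means)
    within = mean(variance(tokens) for tokens in tokens_by_speaker)
    return between / within


# Toy data (hypothetical values): two speakers, three tokens each.
# Speaker means are 2 and 12, so the between-speaker variance is 50;
# each speaker's own variance is 1, so the ratio is 50.
print(f_ratio([[1, 2, 3], [11, 12, 13]]))
```

A larger value means that speakers differ from one another more than each speaker varies internally, which is the sense in which a large F-ratio indicates speaker-discriminating power.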
This paper presents a text-independent speaker verification method using Gaussian mixture models (GMMs) that requires only utterances of enrolled speakers. Artificial cohorts are used instead of cohorts drawn from speaker databases, and the GMMs for the artificial cohorts are generated by changing the model parameters of the GMM for the claimed speaker. Equal error rates for the proposed method are about 60% lower than those of a conventional method that also uses only utterances of enrolled speakers.
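The idea of scoring against artificial cohorts can be sketched roughly as follows. The abstract does not say which model parameters are changed or how, so this example makes its own assumptions: one-dimensional features, cohort models built by shifting the claimed speaker's component means by fixed offsets, and a score equal to the claimed-model log-likelihood minus the average cohort log-likelihood. All names and values are hypothetical.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of 1-D frames under a GMM."""
    total = 0.0
    for x in frames:
        density = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(density)
    return total / len(frames)


def artificial_cohort_means(means, shifts):
    """Assumed perturbation: shift every component mean by each offset."""
    return [[m + s for m in means] for s in shifts]


def normalized_score(frames, weights, means, variances, shifts=(-2.0, 2.0)):
    """Claimed-model log-likelihood minus the mean cohort log-likelihood."""
    claimed = gmm_log_likelihood(frames, weights, means, variances)
    cohort = [
        gmm_log_likelihood(frames, weights, cohort_means, variances)
        for cohort_means in artificial_cohort_means(means, shifts)
    ]
    return claimed - sum(cohort) / len(cohort)


# Hypothetical claimed-speaker model with two components.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]
target = normalized_score([0.1, -0.2, 3.9, 4.2], weights, means, variances)
impostor = normalized_score([2.1, 1.8, 6.0, 5.9], weights, means, variances)
print(target > impostor)
```

In a real system the normalized score would be compared with a threshold tuned on held-out data. The sketch only shows that frames matching the claimed model score higher than frames that do not.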