2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.2004.1327105

English-Chinese bilingual text-independent speaker verification

Abstract: This paper describes the development of a text-independent speaker verification (TISV) system for English and Chinese utterances. We have designed and collected a bilingual database that contains spoken responses and commands in short, medium and long durations. The TISV system uses Gaussian mixtures for speaker models. Our experiments indicate that language mismatch between enrolment and verification data leads to significant degradation in verification performance (between 40% and 49%). In order to maximize ro…

Cited by 9 publications (3 citation statements)
References 5 publications
“…Cross-lingual speaker recognition has been in focus for researchers for some time because of the abundance of bilingual speakers in the world. Ma and Meng (2004) studied the enrollment-test mismatch and found that it caused significant performance degradation for speaker recognition. Auckenthaler et al (2001) investigated the mismatch between training and operation, within a GMM-UBM architecture, finding considerable performance degradation if the speech data used to train the Universal Background Model and the data used to validate/test speakers were in different languages.…”
Section: Related Work (mentioning)
confidence: 99%
“…In (Ma and Meng, 2004), bilingual text-independent speaker recognition task was studied where each speaker is trained using English data and tested with Chinese data. In that study, it was reported that language mismatch between training and test data yields significant degradation.…”
Section: Section (mentioning)
confidence: 99%
“…In that study, it was reported that language mismatch between training and test data yields significant degradation. To alleviate this degradation, authors proposed to model each speaker using both languages (Ma and Meng, 2004). Another solution for bilingual speaker recognition is training two separate speaker models for each target speaker one with Spanish data and the other using English data (Akbacak and Hansen, 2007).…”
Section: Section (mentioning)
confidence: 99%