2012 International Conference on Systems and Informatics (ICSAI2012) 2012
DOI: 10.1109/icsai.2012.6223387

Vowel-category based Short Utterance Speaker Recognition

Abstract: The impact of short utterances on speaker recognition is of significant importance. Despite the advancements in short utterance speaker recognition (SUSR), text dependence and the role of phonemes in carrying speaker information need further investigation. This paper presents a novel method of using vowel categories for SUSR. We define vowel categories (VCs) considering the Chinese and English languages. After recognition and extraction of phonemes, the obtained vowels are divided into VCs, which are then used …

Cited by 6 publications (3 citation statements)
References 9 publications

“…A systematic combination of SUVN with LDA and source-normalised (SN)-LDA was further used effectively. Moreover, an alternative approach was introduced using PLDA to directly model the SUV which showed that for the combination of
[44] performance evaluation JFA, i-vector CSS, WCCN, LDA, NAP, SDNAP, GPLDA
[7] calibration evaluation linear calibration, cosine kernel, normalised cosine kernel
2012
[80] performance evaluation inclusion of short utterances in development data-set
[51] performance evaluation an ad hoc fusion system of different TV spaces
[81] evaluation of phoneme effects adding phonetic information, WCCN and EFR
[82] adding of phonetic information VCs, UBVCM
[83] adding of syllable information syllable categories, universal background syllable models
2013
[49] analysis on phoneme distribution score calibration with log duration as QMF, synthetic i-vectors
[84] analysis of phonetic content TD-ASV, multiple enrolment, used speaker and phonetic content
[85] analysis on confusion errors finding speaker-specific phonemes, formulate text using unique phonemes
[21] analysis on score calibration QMFs, stacked scores, shared scaling, extrapolation
[86] performance analysis TV, PLDA
[87] source and utterance-dur. norm.…”
Section: I-vector Estimation and Normalisation (mentioning)
confidence: 99%
“…The paper [82] introduced an approach of using vowel categories (VCs). After recognising and extracting the phonemes, the extracted vowels are divided into VCs to generate a universal background VC model (UBVCM) for each VC.…”
Section: Research In ASV On Short Utterances (mentioning)
confidence: 99%
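
The grouping step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the vowel-to-category map, the category names, and the function names are all hypothetical, and a single diagonal Gaussian per category stands in for whatever background model the paper actually trains.

```python
from collections import defaultdict
from statistics import fmean, pvariance

# Hypothetical vowel-to-category map (assumption): the paper's actual
# VC definitions for Chinese and English are not given in this report.
VOWEL_CATEGORY = {
    "a": "open", "aa": "open",
    "i": "front_close", "iy": "front_close",
    "u": "back_close", "uw": "back_close",
}


def group_by_category(frames):
    """Group (phoneme_label, feature_vector) pairs by vowel category.

    Phonemes that are not in the vowel map (e.g. consonants) are dropped.
    """
    grouped = defaultdict(list)
    for phoneme, vector in frames:
        category = VOWEL_CATEGORY.get(phoneme)
        if category is not None:
            grouped[category].append(vector)
    return dict(grouped)


def fit_diag_gaussian(vectors, floor=1e-6):
    """Fit one diagonal Gaussian: per-dimension mean and floored variance."""
    dims = list(zip(*vectors))  # transpose to per-dimension columns
    mean = [fmean(d) for d in dims]
    var = [max(pvariance(d), floor) for d in dims]
    return mean, var


def train_background_models(frames):
    """Return one (mean, variance) model per vowel category.

    A single Gaussian is a simplified stand-in (assumption) for a
    per-category universal background model.
    """
    grouped = group_by_category(frames)
    return {cat: fit_diag_gaussian(vecs) for cat, vecs in grouped.items()}


if __name__ == "__main__":
    toy_frames = [
        ("a", [1.0, 2.0]),
        ("aa", [3.0, 4.0]),
        ("t", [9.0, 9.0]),   # consonant, ignored
        ("iy", [0.0, 0.0]),
    ]
    for cat, (mean, var) in train_background_models(toy_frames).items():
        print(cat, mean, var)
```

In a real system the feature vectors would come from a phoneme recogniser's aligned frames, and each category model would typically be a mixture model rather than a single Gaussian.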
“…Kanagasundaram et al. proposed a source and utterance-duration normalized linear discriminant analysis approach to compensate session variability in short utterance i-vector systems [2]. Meanwhile, there are alternate approaches which resort to the text-dependent information, such as vowel-category information [3,4], multi-layer acoustic and temporal structure information [5,6], and so on.…”
Section: Introduction (mentioning)
confidence: 99%