2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
DOI: 10.1109/icassp.2006.1659968

Probabilistic Latent Prosody Analysis for Robust Speaker Verification

citations
Cited by 3 publications
(5 citation statements)
references
References 6 publications
“…To address this problem, higher level information such as the prosodic cues of a speaker, which may be less sensitive to those mismatch, are attractive recently. For example, several works [1][2][3] have shown there is a significant benefit to combining prosodic and spectral features.…”
Section: Introduction
confidence: 99%
“…It usually explores the relationship between the per-frame mel-frequency cepstral coefficients (MFCCs) and pitch/energy contours by directly concatenating them into a single vector stream to build a single model. Score-domain fusion [1][2] is the most popular and successful strategy. But it often ignores the dependency between prosodic and spectral cues and independently establishes one system for one information source.…”
Section: Introduction
confidence: 99%
“…The most important issue for speaker verification is the channel/handset mismatch problem. To address this problem, higher level information including prosodic cues [1,2] and mode of glottal phonation (or voice-quality) [3] of a speaker, which may be less sensitive to channel/handset mismatch are attractive recently.…”
Section: Introduction
confidence: 99%
“…The prosodic information, such as the dynamic of pitch/energy contour, lengthening and pause duration, are already known to be informative and complemented with the spectral features-based speaker recognition approaches [1,2]. On the other hand, there are only few papers working on applying the voice-quality, especially the normalized amplitude quotient (NAQ) [4], to speaker recognition task so far.…”
Section: Introduction
confidence: 99%