2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)
DOI: 10.1109/icassp.2001.940854

Source and system features for speaker recognition using AANN models

Cited by 52 publications
(44 citation statements)
References 8 publications
“…There are methods based on the coarse spectral structure associated with different phones in the speech signal [39]. Another techniques use the excitation function (the "fine" spectral details, such as [85]). Prosodic features representing aspects of the speech that occur over timescales larger than the individual phonemes can also be used, as well as the "Mannerisms" such as particular word choice or preferred phrases, or all kinds of other high-level behavioral characteristics.…”
Section: Speech Processing (mentioning)
confidence: 99%
“…Speaker transformation techniques [28,39,85,40,2,7,77,62,16] might involve modifications of different aspects of the speech signal that carries the speaker's identity. We can cite different methods.…”
Section: Speech Processing (mentioning)
confidence: 99%
“…There have been many efforts in the past to explore the source information that give a better SV performance when combined with the mel frequency cepstral coefficient (MFCC) feature of speech representing vocal tract aspect of speaker information. [5][6][7][8] This is due to the complementary nature of information captured by the source features in comparison to the system features. Also, as the excitation source features are less dependent on the amount of phonetic content, the duration of enrollment/testing can be less.…”
Section: Introduction (mentioning)
confidence: 99%
“…This reinforced the earlier findings of the literature. [5][6][7][8] Further, the improvement in performance is found to be more evident with a decrease in the duration of test data. The work on source features is extended to introduce mel power difference of spectrum in the subband (MPDSS) feature to capture the source information in a different manner.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, experiments exist [9] where it is shown that human beings are able to recognize the identity of the speaker listening to residual signals of LP analysis. Based on this fact several authors have evaluated the usefulness of the LPC-residue and have found that although the identification rates using this kind of information alone does not perform as well as the LP-derived cepstral coefficients, a combination of both can improve the results [20,12,14,22,11]. 2.…”
Section: Introduction (mentioning)
confidence: 99%