2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013
DOI: 10.1109/icassp.2013.6639170

Where are the challenges in speaker diarization?

Abstract: We present a study on the contributions to Diarization Error Rate by the various components of a speaker diarization system. Following on from an earlier study by Huijbregts and Wooters, we extend into more areas and draw somewhat different conclusions. From a series of experiments combining real, oracle and ideal system components, we are able to conclude that the primary cause of error in diarization is the training of speaker models on impure data, something that is in fact done in every current system. We co…
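For readers unfamiliar with the metric, the following is a minimal sketch of how Diarization Error Rate is commonly defined: the sum of missed speech, false-alarm speech and speaker-confusion time, divided by total reference speech time. The function name and the durations are hypothetical and chosen for illustration only; they are not taken from the paper, and this is not the authors' scoring procedure.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in seconds. This follows the commonly
    used definition of DER; it is an illustrative helper, not the
    scoring tool used in the paper.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations in seconds, for illustration only.
components = {"missed": 30.0, "false_alarm": 20.0, "confusion": 50.0}
total = 1000.0

der = diarization_error_rate(total_speech=total, **components)

# Share of the overall DER attributable to each error type.
shares = {name: value / total for name, value in components.items()}

print(f"DER = {der:.1%}")
for name, share in shares.items():
    print(f"  {name}: {share:.1%}")
```

Breaking the rate into per-component shares like this is one simple way to see which error type dominates, which is the kind of question the abstract raises about system components.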

Cited by 17 publications (9 citation statements)
References 10 publications
“…They report a diarization error rate of 27.6%. While fully automated diarization procedures are appealing, diarization error rates can substantially be improved when introducing a learning step into the procedure, based on a small quantity of pure data (Sinclair and King, 2013). This relates to the idea of supervised machine learning.…”
Section: Speaker Diarization In Psychotherapy Research (mentioning)
confidence: 99%
“…A proper stopping criterion in diarization is crucial as it affects the final diarization error rate (DER) [7]. Stopping the clustering process at the number of clusters less than the actual number of speakers is termed as over-clustering, and the opposite is referred to as under-clustering [1].…”
Section: B Stopping Criterion For Tpib-lda and Vartpib-lda Systems (mentioning)
confidence: 99%
“…In general, an approach to unsupervised diarization of a conversational speech includes speech segment initialization followed by the bottom-up agglomerative clustering of the segments [1]. The major challenges in building an unsupervised speaker diarization system lie in the initialization of segments for clustering, to obtain speaker discriminative features, deciding on the number of speakers, and detection of overlapped speaker segments [1], [7]. A good segment initialization and speaker discriminative features help to improve the performance of the diarization system.…”
Section: Introduction (mentioning)
confidence: 99%
“…To extract Malay speech from the found data, we applied a speaker diarisation technique (Sinclair and King, 2013) as it is able to identify speaker homogenous regions throughout the speech data. First, feature extraction is performed on the data using HTK with the standard ASR features.…”
Section: Diarisation (mentioning)
confidence: 99%