2016
DOI: 10.1016/j.csl.2016.03.005

A study of speaker clustering for speaker attribution in large telephone conversation datasets

Citations: cited by 11 publications (13 citation statements)
References: 29 publications
“…Similarity scores for all possible speaker pairs (i.e, the reference speakers with known speaker identities and the speakers with pseudo labels assigned by the SD system) are calculated. Based on these speaker similarity scores represented in the form of a distance matrix, speaker linking is performed by applying complete-linkage clustering as described in [39]. A mild level of linking is performed, as aggressive merging of the speaker labels conceivably has an inverse effect on the ASR performance, which has been also observed in the pilot experiments.…”
Section: Speaker Linking (citation type: mentioning)
confidence: 95%
“…A similarity matrix for all possible speaker pairs are calculated using a PLDA model [23] to find the cross-tape speaker similarities. Based on these speaker similarity scores, speaker linking is performed by applying complete-linkage clustering as described in [7]. Varying levels of speaker linking can be performed by manipulating the clustering threshold results in different levels of speaker and cluster impurities.…”
Section: Second Stage: Speaker Linking and Identification (citation type: mentioning)
confidence: 99%
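
The threshold-controlled complete-linkage linking mentioned in the statement above can be illustrated with a minimal sketch. This is not the cited authors' implementation: the function name `complete_linkage_clusters`, the toy distance matrix, and the threshold values are all assumptions made for illustration, and the PLDA scoring step is replaced by hand-picked distances.

```python
from itertools import combinations


def complete_linkage_clusters(dist, threshold):
    """Bottom-up clustering with complete linkage (hypothetical helper).

    dist: symmetric square matrix (list of lists) of pairwise distances.
    threshold: merging stops once the closest pair of clusters is
        farther apart than this value.
    Returns a sorted list of clusters, each a sorted list of indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        if linkage(clusters[p], clusters[q]) > threshold:
            break  # No remaining pair is close enough to merge.
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Toy distances (assumed values): speakers 0/1 and 2/3 are close pairs.
toy_dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.85, 0.8],
    [0.8, 0.85, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

mild = complete_linkage_clusters(toy_dist, threshold=0.3)
aggressive = complete_linkage_clusters(toy_dist, threshold=1.0)
```

With these toy values, the lower threshold leaves two clusters and the higher threshold merges all four speakers into one, which mirrors how a looser threshold trades cluster purity for more linking.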
“…We have considered the conventional bottom-up AHC clustering system with the options of single and average linkages. We did not consider the model retraining approach because it is costly in terms of computations as compared to the linkage approaches to clustering [14]. The system starts with initial number of clusters equal to the total number of speaker segments.…”
Section: Speaker Clustering (citation type: mentioning)
confidence: 99%
“…On the other hand, utterances from same speakers must not be distributed among multiple clusters. Several approaches to speaker clustering task exist, for example cost optimization, sequential and Agglomerative Hierarchical Clustering [12,13,14,15]. Some approaches rely on commonly used statistical speaker modeling like Gaussian Mixture Models (GMMs) while others use features extracted using Deep Neural Networks (DNNs).…”
Section: Introduction (citation type: mentioning)
confidence: 99%