Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech 2010
DOI: 10.1145/1878101.1878104

Speaker role recognition to help spontaneous conversational speech detection

Cited by 11 publications (5 citation statements)
References 10 publications
“…In the literature, SRR methods have been studied in purpose of chaptering audio-visual documents (talk shows and broadcast news). Existing methods are divided according to the features extracted (audio and/or text), the decision level (for each speaker turn [2,3] or globally on all the turns of a given speaker [4,5,6,7,8]) and classification techniques (supervised [4,5,2,3] or unsupervised [6,7,8]).…”
Section: Related Work (mentioning)
confidence: 99%
“…A loss of 1.1% in accuracy is shown when automatic features are extracted compared to the use of manually labelled linguistic phenomena. In [5], authors used temporal, acoustic and prosodic features to classify roles at the speaker cluster level. Authors distinguish between punctual and non punctual speaker roles and train Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers hierarchically.…”
Section: Related Work (mentioning)
confidence: 99%
“…Speaker Role Recognition (SRR) is the task of assigning a specific role to each speaker turn (speaker-homogeneous segment) in a speech signal. This task plays a significant role in numerous areas, such as information retrieval [1], audio indexing [2], or social interaction analysis [3]. Most of the research efforts have been focused on identifying roles in broadcast news programs or talk shows [4][5][6][7], while there have been also works dealing with meeting scenarios [8], conferences [9], medical discussions between domain experts [10], and psychotherapy sessions [11].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the case of speaker-level SRR (Figure 1b), the classifier is built in two steps, the first being a Speaker Clustering (SC) algorithm, or a diarization system in the more general case, where the turns are grouped into same-speaker clusters in an unsupervised way and then each cluster is assigned a specific role. In this line of work, [16] uses a social network analysis approach taking into consideration relational data across different speakers, while a hierarchical classification system is proposed in [2] and [12]. The effect of various modalities on the final performance of SRR when using boosting algorithms is investigated in [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the literature, the problem is seen as a multiclass classification problem where each speaker of a show has to be associated with a role label. In this way, some previous studies have tackled the problem using machine learning from mainly lexical features extracted from the transcription [1,2,3], from acoustic / prosodic features [4,5], or from a combination of lexical and acoustic features [6]. Those studies have highlighted the efficiency of a boosting algorithm over decision stumps to combine the various features.…”
Section: Introduction (mentioning)
confidence: 99%