Boosting bonsai trees for efficient features combination: application to speaker role identification

Laurent, Antoine; Camelin, Nathalie; Raymond, Christian

doi:10.21437/interspeech.2014-16

Cited by 10 publications

(3 citation statements)

References 11 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…We use a glass-box classifier called Bonzaiboost 1 [8] based on boosting [14] where a set of weak classifiers made of small decision trees on the features of GF Ambiguity # of semantic labels acceptable for W # of Part-Of-Speech (POS) acceptable for W + POS label # of possible syntactic dependency for W + dependency label distance between W and the sentence syntactic root. utterance length (in words) % of words in S belonging to a concept Coverage # of occurrences of W in train # of occurrences of (W, l) in train is bigrams (W − 1,W ) and (W,W + 1) occurring in train?…”

Section: Analyzing Complexity Factorsmentioning

confidence: 99%

Can We Predict How Challenging Spoken Language Understanding Corpora Are Across Sources, Languages, and Domains?

Béchet

Raymond

Hamane

et al. 2022

Lecture Notes in Electrical Engineering

Self Cite

View full text Add to dashboard Cite

State-of-the art Spoken Language Understanding models of Spoken Dialog Systems achieve remarkable results on benchmark corpora thanks to the winning combination of pretraining on large collection of out-of-domain data with contextual Transformer representations and fine tuning on in-domain data. On average, performances are almost perfect on benchmark datasets such ATIS. However some phenomena can affect greatly these performance, like unseen events or ambiguities. They are the major sources of errors in real-life deployed systems although they are not necessarily equally represented in benchmark corpora. This paper aims to predict and characterize error-prone utterances and to explain what makes a given corpus more or less challenging. After training such a predictor on benchmark corpora from various languages and domains, we confront it to a new corpus collected from a French deployed vocal assistant with different distributional properties. We show that the predictor can highlight challenging utterances and explain the main complexity factors even though this corpus was collected in a completely different setting.

show abstract

Section: Analyzing Complexity Factorsmentioning

confidence: 99%

Can We Predict How Challenging Spoken Language Understanding Corpora Are Across Sources, Languages, and Domains?

Béchet

Raymond

Hamane

et al. 2022

Lecture Notes in Electrical Engineering

Self Cite

View full text Add to dashboard Cite

show abstract

“…In a preliminary experiment, we will evaluate these features for quality assessment in ASR only (W CE ASR task). Two different classifiers will be used: a variant of boosting classification algorithm called bonzaiboost [14] (implementing the boosting algorithm Adaboost.MH over deeper trees) and the Conditional Random Fields [12].…”

Section: Wce Features For Speech Transcription (Asr)mentioning

confidence: 99%

“…This allowed us to observe that WCE performance decreases as ASR system improves. For reproducible research, most features14 and algorithms used in this paper are available through our toolkit called WCE-LIG. This package is made available on a GitHub repository 15 under the licence GPL V3.…”

mentioning

confidence: 99%

Automatic quality estimation for speech translation using joint ASR and MT features

Lecouteux

Besacier

2018

Machine Translation

View full text Add to dashboard Cite

This paper addresses automatic quality assessment of spoken language translation (SLT). This relatively new task is defined and formalized as a sequence labeling problem where each word in the SLT hypothesis is tagged as good or bad according to a large feature set. We propose several word confidence estimators (WCE) based on our automatic evaluation of transcription (ASR) quality, translation (MT) quality, or both (combined ASR+MT). This research work is possible because we built a specific corpus which contains 6.7k utterances for which a quintuplet containing: ASR output, verbatim transcript, text translation, speech translation and post-edition of translation is built. The conclusion of our multiple experiments using joint ASR and MT features for WCE is that MT features remain the most influent while ASR feature can bring interesting complementary information. Our robust quality estimators for SLT can be used for re-scoring speech translation graphs or for providing feedback to the user in interactive speech translation or computer-assisted speech-to-text scenarios.

show abstract

Group Behavior Recognition

Wang

2020

Human Behavior Analysis: Sensing and Understanding

View full text Add to dashboard Cite

Boosting bonsai trees for efficient features combination: application to speaker role identification

Cited by 10 publications

References 11 publications

Can We Predict How Challenging Spoken Language Understanding Corpora Are Across Sources, Languages, and Domains?

Can We Predict How Challenging Spoken Language Understanding Corpora Are Across Sources, Languages, and Domains?

Automatic quality estimation for speech translation using joint ASR and MT features

Group Behavior Recognition

Contact Info

Product

Resources

About