Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.60
Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection

Abstract: In modern interactive speech-based systems, speech is consumed and transcribed incrementally prior to having disfluencies removed. This post-processing step is crucial for producing clean transcripts and high performance on downstream tasks (e.g. machine translation). However, most current state-of-the-art NLP models, such as the Transformer, operate non-incrementally, potentially causing unacceptable delays. We propose a streaming BERT-based sequence tagging model that, combined with a novel training objective, …

Cited by 3 publications (4 citation statements) | References 20 publications
“…Particularly, Table 4 illustrates the performance of the Baseline and the presented models on the two datasets. The STD module achieves cutting-edge performance (Chen et al., 2022) in detecting single-turn disfluencies. Also, the MTD module outperforms the Baseline in detecting multi-turn discontinuities with our proposed MultiTurnCleanup dataset.…”

Section: Results
confidence: 99%
“…Compared with the existing disfluency detection task, which aims to detect disfluencies (e.g., self-repairs, repetitions, restarts, and filled pauses) that commonly occur within single-turn utterances (Rocholl et al., 2021; Chen et al., 2022), the Multi-Turn Cleanup task requires identifying discontinuities both within a single turn and across multiple turns in multi-party spoken conversational transcripts. To explicitly define the task and discontinuity taxonomy, we conducted an in-depth analysis of the Switchboard corpus (Godfrey et al., 1992).…”

Section: Task Definition
confidence: 99%
“…The restart-incremental paradigm was investigated for Transformer-based sequence labelling by Madureira and Schlangen (2020) and Kahardipraja et al. (2021); recently, adaptive policies were proposed to reduce the computational load (Kaushal et al., 2023; Kahardipraja et al., 2023). Rohanian and Hough (2021) and Chen et al. (2022) explored adaptation strategies to use Transformers for incremental disfluency detection. In simultaneous translation, where policies are a central concept (Zheng et al., 2020a; Zhang et al., 2020), the restart-incremental approach is in use and revisions are studied (Arivazhagan et al., 2020; Sen et al., 2023).…”

Section: Related Literature
confidence: 99%