2020
DOI: 10.48550/arxiv.2011.06195
Preprint

Towards Semi-Supervised Semantics Understanding from Speech

Abstract: Much recent work on Spoken Language Understanding (SLU) falls short in at least one of three ways: models were trained on oracle text input and neglected Automatic Speech Recognition (ASR) outputs, models were trained to predict only intents without slot values, or models were trained on a large amount of in-house data. To address these shortcomings, we propose a clean and general framework to learn semantics directly from speech with semi-supervision from transcribed speech. Our framework is built upon pretrained …
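
To make the setup described in the abstract concrete, here is a minimal, hypothetical sketch of an end-to-end SLU model that predicts an utterance-level intent and frame-level slot labels from speech features, with an auxiliary transcription head. The encoder choice (a BiLSTM stand-in for a pretrained speech encoder), the layer sizes, and the CTC-style auxiliary objective are assumptions for illustration only, not the architecture of the paper.

```python
import torch
import torch.nn as nn

class SLUSketch(nn.Module):
    """Illustrative end-to-end SLU skeleton (not the paper's actual model)."""

    def __init__(self, feat_dim=80, hidden=256, n_intents=7, n_slots=40, vocab_size=32):
        super().__init__()
        # Stand-in for a pretrained speech encoder; a BiLSTM is used purely for brevity.
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * hidden, n_intents)  # utterance-level intent
        self.slot_head = nn.Linear(2 * hidden, n_slots)      # frame-level slot tags
        self.ctc_head = nn.Linear(2 * hidden, vocab_size)    # auxiliary transcription loss

    def forward(self, feats):
        # feats: (batch, time, feat_dim) acoustic features
        enc, _ = self.encoder(feats)                          # (batch, time, 2*hidden)
        intent_logits = self.intent_head(enc.mean(dim=1))     # mean-pool over time
        slot_logits = self.slot_head(enc)                     # per-frame slot scores
        ctc_log_probs = self.ctc_head(enc).log_softmax(-1)    # for a CTC-style ASR loss
        return intent_logits, slot_logits, ctc_log_probs
```

In a semi-supervised setup of the kind the abstract describes, the intent and slot losses would be computed only on SLU-labeled utterances, while the transcription loss (here CTC) could be computed on any transcribed speech; how these terms are weighted and which components are pretrained is left unspecified in this sketch.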

Cited by 3 publications (5 citation statements)
References 72 publications (135 reference statements)
“…It shows the best performance when the speech data themselves are used as input, but this end-to-end method requires a lot of computation. The proposed method achieves performance comparable to Lai's approach [13] through a simple spoken language representation. As shown in Table 5, the average accuracy of our proposed model was the highest on the Audio-Snips dataset.…”
Section: Comparisons To Other Studies
Confidence: 88%
“…Chung et al. utilized a masking-policy approach for the SLU task to jointly pre-train unpaired speech and text by aligning their representations [25]. SpeechBERT was trained via a semi-supervised method, covering not only representation learning but also intent classification and slot filling [13]. The model was tested to demonstrate its robustness against ASR errors and its ability to extract semantic meaning from the input sequence.…”
Section: ASR-SLU-Based Intent Classification
Confidence: 99%
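
As a loose illustration of the robustness check mentioned in the statement above (not the evaluation protocol of the cited works), one could score the same intent classifier on oracle transcripts and on ASR hypotheses and report the gap. The prediction function and data-field names below are hypothetical.

```python
# Hypothetical sketch: quantify robustness to ASR errors by comparing intent
# accuracy on reference transcripts vs. ASR hypotheses. `predict_intent` and
# the field names are assumed for illustration, not a real library API.
def intent_accuracy(predict_intent, utterances, text_field):
    """Fraction of utterances whose predicted intent matches the gold label."""
    correct = 0
    for utt in utterances:
        pred = predict_intent(utt[text_field])
        correct += int(pred == utt["intent"])
    return correct / max(len(utterances), 1)

def asr_robustness_gap(predict_intent, utterances):
    """Accuracy on reference text minus accuracy on ASR output (smaller gap = more robust)."""
    oracle = intent_accuracy(predict_intent, utterances, "reference_text")
    hypothesis = intent_accuracy(predict_intent, utterances, "asr_hypothesis")
    return oracle - hypothesis
```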