2021
DOI: 10.1016/j.csl.2020.101142

Low resource end-to-end spoken language understanding with capsule networks

Abstract: Designing a Spoken Language Understanding (SLU) system for command-and-control applications is challenging. Both Automatic Speech Recognition and Natural Language Understanding are language and application dependent to a great extent. Even with a lot of design effort, users often still have to know what to say to the system for it to do what they want. We propose to use an end-to-end SLU system that maps speech directly to semantics and that can be trained by the user through demonstrations. The user can teach…

Cited by 13 publications
(9 citation statements)
References 20 publications
“…The authors showed that the end-to-end speech recognition system with capsule networks on one-second speech commands dataset achieves better results on both clean and noise-added tests than baseline convolutional neural network models. A capsule network for low resource spoken language understanding was proposed for command-and-control applications in [5]. For small quantities of data, the proposed model is shown to significantly outperform the previous state-of-the-art model.…”
Section: Related Work (mentioning)
confidence: 99%
“…According to the survey results, the shortage of publicly available continuous speech corpora justifies additional study focus in this field. It also illustrates the need of big corpora or a benchmark in boosting Arabic language research for good human-computer interaction. A Sindhi Unicode-8-based linguistics data set is also multi-class and multi-featured [19]. It was created to address natural language processing (NLP) and linguistic issues in the Sindhi language.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The margin loss was replaced by the computation of connectionist temporal classification (CTC). In another paper, Poncelet et al [10] used capsule networks with recurrent neural networks, additionally encoding time information, an essential property present in speech. They applied this approach in the field of spoken language understanding (SLU).…”
Section: Capsule Network (mentioning)
confidence: 99%
“…One important consideration in performing deep learning or machine learning in general is the choice of the loss function. Most of the capsule network implementations in other literature [9][10][11] use the original margin loss as described by Sabour et al [8]. Only a few have attempted deviating from the original implementation and instead have employed other loss functions.…”
Section: Introduction (mentioning)
confidence: 99%