2021 IEEE Spoken Language Technology Workshop (SLT)
DOI: 10.1109/slt48900.2021.9383559
A Light Transformer For Speech-To-Intent Applications

Abstract: Spoken language understanding (SLU) systems can make life more agreeable, safer (e.g. in a car) or can increase the independence of physically challenged users. However, due to the many sources of variation in speech, a well-trained system is hard to transfer to other conditions like a different language or to speech impaired users. A remedy is to design a user-taught SLU system that can learn fully from scratch from users' demonstrations, which in turn requires that the system's model quickly converges after …

Cited by 5 publications (12 citation statements); references 12 publications.
“…The light transformer is a lightweight version of the vanilla transformer that includes a low-dimensional (light) relative position encoding (PE) matrix [23]. We first briefly recap the vanilla transformer [24].…”
Section: The Baseline Light Transformer
confidence: 99%
“…To account for order information, the vanilla transformer adds a d-dimensional absolute PE to the content embedding x, which requires the network to learn in which subspace relevant data variation occurs and in which subspace position is represented. To avoid the hassle brought by this additive high-dimensional PE, we introduced a 6-dimensional relative position encoding in [23]. It is defined by Eq.…”
Section: The Baseline Light Transformer
confidence: 99%
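The contrast the excerpt draws — a d-dimensional absolute PE added into the content embedding versus a low-dimensional relative PE kept out of the content stream and used only as an attention bias — can be sketched as below. The specific dimensions, the matrix `R` (learned in the actual model, random here), and the projection vector `u` are illustrative assumptions, not the paper's exact formulation (the excerpt's defining equation is truncated above).

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, d_pe = 5, 16, 6  # sequence length, model dim, light PE dim (6-dim, per [23])

# Vanilla transformer: a d-dimensional absolute sinusoidal PE is added to x,
# so position and content share the same d-dimensional space.
def absolute_pe(T, d):
    pos = np.arange(T)[:, None]                       # (T, 1)
    i = np.arange(d)[None, :]                         # (1, d)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))  # (T, d)

x = rng.standard_normal((T, d))          # content embeddings
x_abs = x + absolute_pe(T, d)            # vanilla path: position mixed into content

# Light-transformer idea (sketch): score attention with a low-dimensional
# relative PE matrix R instead of inflating the content embedding.
R = rng.standard_normal((2 * T - 1, d_pe))  # one row per relative offset i - j
u = rng.standard_normal(d_pe)               # hypothetical query-side projection

content_scores = x @ x.T / np.sqrt(d)       # content-content attention term
rel_bias = np.array([[R[i - j + T - 1] @ u for j in range(T)]
                     for i in range(T)])    # relative-position term, (T, T)
scores = content_scores + rel_bias
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax

assert attn.shape == (T, T)
assert np.allclose(attn.sum(axis=1), 1.0)
```

The point of the sketch: the relative term touches only a (2T-1) x 6 table rather than widening every token embedding, which is what makes the PE "light" and keeps content and position in separate subspaces.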
“…Training data is very scarce for several reasons, including the increased effort required to collect data from this population [4,5]. [6,7,8] present a remedy for low-resource SLU by designing a user-taught SLU system. "User-taught" refers to the strategy in which the SLU agent learns from scratch, based only on spoken commands and corresponding task demonstrations from its users.…”
Section: Introduction
confidence: 99%