Interspeech 2021
DOI: 10.21437/interspeech.2021-427

Transformer-Based End-to-End Speech Recognition with Residual Gaussian-Based Self-Attention

Abstract: Self-attention (SA), which encodes vector sequences according to their pairwise similarity, is widely used in speech recognition due to its strong context-modeling ability. However, its accuracy degrades when it is applied to long sequences, because its weighted-average operator can disperse the attention distribution, causing the relationships between adjacent signals to be ignored. To address this issue, in this paper we introduce relative-position-aware self-a… [abstract truncated]
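Although the abstract is truncated, the mechanism it builds on can be illustrated. Below is a minimal NumPy sketch of scaled dot-product self-attention with an additive Gaussian locality bias on relative positions; the function name, the `sigma` parameter, and the way the bias enters the softmax are illustrative assumptions, not the paper's residual Gaussian formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gaussian_biased_self_attention(x, wq, wk, wv, sigma=2.0):
    """Scaled dot-product self-attention with an additive Gaussian
    locality bias. `sigma` (illustrative) controls how strongly the
    attention is pulled toward neighbouring frames; the paper's
    residual Gaussian design is not reproduced here."""
    t, d = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv           # (t, d) projections
    scores = q @ k.T / np.sqrt(d)              # pairwise similarity
    # Gaussian bias: positions far from the query are penalised,
    # counteracting the dispersion of attention over long sequences.
    pos = np.arange(t)
    rel = pos[None, :] - pos[:, None]          # relative offsets
    bias = -(rel.astype(float) ** 2) / (2.0 * sigma ** 2)
    weights = softmax(scores + bias)           # (t, t) attention map
    return weights @ v                         # weighted average of values

# Toy usage: 50 frames of 16-dimensional features.
rng = np.random.default_rng(0)
t, d = 50, 16
x = rng.standard_normal((t, d))
wq, wk, wv = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
out = gaussian_biased_self_attention(x, wq, wk, wv, sigma=3.0)
print(out.shape)  # (50, 16)
```

With a small `sigma` the attention map concentrates around the diagonal, emphasising adjacent frames; with a large `sigma` it reverts to plain dot-product attention.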

Cited by 5 publications (2 citation statements)
References 22 publications

“…The Conformer [5,6], proposed by Anmol Gulati et al., combines convolution with self-attention: self-attention learns global interactions, while convolution effectively captures local correlations based on relative offsets, giving better results than using convolution or self-attention alone.…”
Section: Conformer
confidence: 99%
“…(b) Parallel computing capability: The self-attention mechanism in the transformer enables direct interaction between representations at each input position and all other positions, thereby facilitating highly parallelized computation [43]. Traditional recurrent neural network (RNN) models may encounter computational efficiency limitations when handling lengthy sequences in air-writing.…”
Section: Transformer Model
confidence: 99%
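The parallelism point can be made concrete: an RNN must process the sequence one step at a time because each state depends on the previous one, whereas self-attention produces every position from a few matrix products. A toy NumPy contrast (random weights, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
t, d = 200, 32
x = rng.standard_normal((t, d))

# RNN-style recurrence: inherently sequential, since h[i] needs h[i-1].
w_in = 0.1 * rng.standard_normal((d, d))
w_rec = 0.1 * rng.standard_normal((d, d))
h = np.zeros(d)
rnn_states = np.empty((t, d))
for i in range(t):                        # cannot be parallelised over i
    h = np.tanh(x[i] @ w_in + h @ w_rec)
    rnn_states[i] = h

# Self-attention: every position interacts with every other position in
# one batched computation (a handful of matrix products).
wq, wk, wv = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
q, k, v = x @ wq, x @ wk, x @ wv
scores = q @ k.T / np.sqrt(d)
scores -= scores.max(axis=-1, keepdims=True)
attn = np.exp(scores)
attn /= attn.sum(axis=-1, keepdims=True)
attended = attn @ v                       # all t positions at once

print(rnn_states.shape, attended.shape)   # (200, 32) (200, 32)
```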