ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp39728.2021.9414560
Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition

Abstract: This paper proposes an efficient memory transformer, Emformer, for low latency streaming speech recognition. In Emformer, the long-range history context is distilled into an augmented memory bank to reduce the computational complexity of self-attention. A cache mechanism saves the computation for the key and value in self-attention for the left context. Emformer applies parallelized block processing in training to support low latency models. We carry out experiments on the benchmark LibriSpeech data. Under average latency…
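For intuition, below is a minimal, self-contained sketch (plain NumPy, single attention head, identity projections) of the chunk-wise attention pattern the abstract describes: each incoming chunk attends over a bounded window made of a pooled memory bank summarizing older chunks, the cached keys/values of the previous chunk (the left context), and the chunk itself. The names (`streaming_chunk_attention`, `n_mem_max`) are illustrative rather than from the paper, and mean pooling here is only a stand-in for the paper's memory computation; look-ahead (right) context and learned projections are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def streaming_chunk_attention(chunks, d, n_mem_max=4):
    """Toy single-head attention over a stream of chunks.

    Each chunk's queries attend over:
      * a small memory bank distilled from all older chunks,
      * cached keys/values of the immediately preceding chunk, and
      * the current chunk itself,
    so the per-chunk attention cost stays bounded instead of growing
    with the full history, mimicking Emformer's augmented memory bank
    and left-context key/value cache.
    """
    memory_k, memory_v = [], []   # distilled history (memory bank)
    cache_k = cache_v = None      # cached left-context keys/values
    outputs = []
    for chunk in chunks:          # chunk: (chunk_len, d)
        q = chunk                 # toy projections: identity
        k_parts, v_parts = [], []
        if memory_k:
            k_parts.append(np.stack(memory_k))
            v_parts.append(np.stack(memory_v))
        if cache_k is not None:
            k_parts.append(cache_k)
            v_parts.append(cache_v)
        k_parts.append(chunk)
        v_parts.append(chunk)
        k = np.concatenate(k_parts)
        v = np.concatenate(v_parts)
        att = softmax(q @ k.T / np.sqrt(d))
        outputs.append(att @ v)
        # cache this chunk's keys/values as the next chunk's left context
        cache_k, cache_v = chunk, chunk
        # distill the finished chunk into one memory vector (mean pooling)
        memory_k.append(chunk.mean(0))
        memory_v.append(chunk.mean(0))
        memory_k, memory_v = memory_k[-n_mem_max:], memory_v[-n_mem_max:]
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
stream = [rng.normal(size=(8, 16)) for _ in range(5)]  # five 8-frame chunks
out = streaming_chunk_attention(stream, d=16)
print(out.shape)  # (40, 16): per-frame outputs with bounded attention cost
```

Readers who want the real model rather than this toy can look at the Emformer implementation shipped in recent torchaudio releases (`torchaudio.models.Emformer`).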

Cited by 105 publications (44 citation statements)
References 34 publications
“…We focus on building models for low latency streaming on-device speech recognition using the Emformer [2] transducer. Emformer is an efficient extension of the Augmented Memory Transformer (AM-TRF) [1].…”
Section: Low Latency Emformer Transducer
confidence: 99%
“…Since we perform experiments for low latency conditions, the center chunk size and look-ahead context size in Emformer are set to 160 ms and 40 ms, respectively. The algorithmic latency [2] of the acoustic encoders is 120 ms.…”
Section: Datasets and Setup
confidence: 99%
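As a quick sanity check on the figures quoted above: under the average-latency convention used in the Emformer paper, a frame waits on average half the center chunk before its chunk is processed, plus the fixed look-ahead. The quoted numbers are consistent with that convention:

```python
# Average algorithmic latency of chunk-based streaming attention,
# assuming the half-chunk-plus-look-ahead convention from the paper.
center_chunk_ms = 160
look_ahead_ms = 40
avg_latency_ms = center_chunk_ms / 2 + look_ahead_ms
print(avg_latency_ms)  # 120.0 ms, matching the figure quoted above
```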