2021
DOI: 10.3390/info12040165

A 2D Convolutional Gating Mechanism for Mandarin Streaming Speech Recognition

Abstract: Recent research shows that the recurrent neural network transducer (RNN-T) architecture has become a mainstream approach for streaming speech recognition. In this work, we investigate the VGG2 network as the input layer to the RNN-T in streaming speech recognition. Specifically, before the input feature is passed to the RNN-T, we introduce a gated-VGG2 block, which uses the first two layers of VGG16 to extract contextual information in the time domain and then uses an SENet-style gating mechanism to control what inf…
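
The SENet-style gating mentioned in the abstract can be illustrated with a minimal sketch of the generic squeeze-and-excitation pattern. The abstract is truncated here, so this is not necessarily the authors' exact block: the function name se_gate, the weight layout, the absence of bias terms, and the (channels, time, frequency) feature-map shape are all assumptions made for illustration, and the two VGG16 convolution layers that would produce the feature map are omitted.

```python
import math

def se_gate(feature_map, w1, w2):
    """Generic squeeze-and-excitation gate (illustrative, not the paper's code).

    feature_map: list of C channels, each a T x F list of lists (assumed layout).
    w1: R x C weights of the reduction layer (hypothetical, no bias).
    w2: C x R weights of the expansion layer (hypothetical, no bias).
    Returns the gated feature map and the per-channel gate values.
    """
    # Squeeze: global average over time and frequency for each channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: linear reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: linear expansion followed by a sigmoid,
    # giving one gate value in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    gated = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return gated, gates


# Two channels, one time step, two frequency bins; one hidden unit.
features = [[[1.0, 3.0]], [[2.0, 2.0]]]
gated, gates = se_gate(features, w1=[[0.5, -0.5]], w2=[[1.0], [-1.0]])
```

Each gate lies strictly between 0 and 1, so a channel is attenuated rather than removed outright. In a trained model the weights would be learned jointly with the rest of the network.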

Citations: cited by 2 publications (1 citation statement)
References: 20 publications
“…Organizing and conducting manual testing of Putonghua proficiency consumes a lot of human and material resources, and it is impossible to avoid the influence of subjective factors in scoring. Therefore, many researchers have carried out studies on objective testing of Putonghua proficiency with the help of computers, and a lot of progress has been made [1][2][3].…”
Section: Introduction
Type: mentioning
confidence: 99%