Interspeech 2018
DOI: 10.21437/interspeech.2018-1204
Compact Feedforward Sequential Memory Networks for Small-footprint Keyword Spotting

Abstract: Due to limited resources on devices and complicated scenarios, a compact model with high precision, low computational cost, and low latency is expected for small-footprint keyword spotting tasks. To fulfill these requirements, this paper investigates the compact Feedforward Sequential Memory Network (cFSMN), which combines low-rank matrix factorization with the conventional FSMN, for a far-field keyword spotting task. The effect of its architecture parameters is analyzed. Towards achieving lower computational cost, mul…
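The low-rank matrix factorization the abstract refers to can be sketched generically: a dense weight matrix W (m × n) is replaced by the product of two thin matrices V (m × r) and U (r × n), cutting parameters from m·n to r·(m + n). The specific layer dimensions below are illustrative assumptions, not the paper's actual cFSMN configuration.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) by V (m x r) @ U (r x n) via truncated SVD.
    This is the generic low-rank trick; cFSMN inserts such a
    bottleneck between layers to shrink the model footprint."""
    U_full, s, Vt_full = np.linalg.svd(W, full_matrices=False)
    V = U_full[:, :rank] * s[:rank]   # m x r, singular values folded in
    U = Vt_full[:rank, :]             # r x n
    return V, U

# Illustrative sizes (assumed, not from the paper)
m, n, r = 512, 512, 64
W = np.random.randn(m, n)
V, U = low_rank_factorize(W, r)

orig_params = m * n            # 262144 parameters for the dense matrix
factored_params = r * (m + n)  # 65536 parameters after factorization
```

With these sizes the factored form uses a quarter of the original parameters, which is the kind of saving that makes the model viable on resource-limited devices.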


Cited by 10 publications (11 citation statements). References 24 publications (24 reference statements).
“…Here, we address the problem of keyword spotting in a more challenging setting with competing talkers. Recent studies tackle KWS for both efficient computation [4] and small footprint [6,7], while not many studies address these problems in the context of SV [3,2]. Efforts exist for either jointly solving both tasks [5] or solving a single task in the presence of background noise [10].…”
Section: Related Work
confidence: 99%
“…Our proposed model for KWS is competitive with state-of-the-art models, e.g. [4,5,6]. It uses spike count frames that only incur a feature computation delay of 5 ms, while more traditional methods that use log-filterbank features usually incur a feature computation delay of 30–40 ms.…”
Section: Resource Requirements
confidence: 99%
“…They collect numerous variations of a specific keyword utterance and train neural networks (NNs), which have been a promising method in the field. [1,2] have an acoustic encoder and a sequence matching decoder as separate modules. The NN-based acoustic models (AMs) predict senone-level posteriors.…”
Section: Introduction
confidence: 99%
“…Compared to RNNs and CNNs, cFSMN can achieve comparable performance with fewer model parameters and more efficient computation. It has therefore been adopted in many other tasks, such as text-to-speech (TTS) [26] and small-footprint keyword spotting (KWS) [27].…”
Section: Introduction
confidence: 99%