2018
DOI: 10.48550/arxiv.1811.00348
Preprint

Sequence-to-sequence Models for Small-Footprint Keyword Spotting

Haitong Zhang,
Junbo Zhang,
Yujun Wang

Abstract: In this paper, we propose a sequence-to-sequence model for keyword spotting (KWS). Compared with other end-to-end architectures for KWS, our model simplifies the pipeline of a production-quality KWS system and satisfies the requirements of high accuracy, low latency, and small footprint. We also evaluate the performance of different encoder architectures, including LSTM and GRU. Experiments on real-world wake-up data show that our approach outperforms the recently proposed attention-based end-to-end mode…
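The abstract compares LSTM and GRU encoders for small-footprint KWS. As a rough illustration only (not the authors' implementation), the sketch below builds a minimal GRU encoder in NumPy that consumes a sequence of acoustic feature frames and produces keyword/filler posteriors from the final hidden state; all weight initialisations, dimensions, and function names here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with randomly initialised (untrained) weights,
    for illustration of the recurrence only."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        def w(shape):
            return rng.normal(0.0, 0.1, shape)
        self.Wz, self.Uz = w((hidden_dim, input_dim)), w((hidden_dim, hidden_dim))
        self.Wr, self.Ur = w((hidden_dim, input_dim)), w((hidden_dim, hidden_dim))
        self.Wh, self.Uh = w((hidden_dim, input_dim)), w((hidden_dim, hidden_dim))
        self.bz = np.zeros(hidden_dim)
        self.br = np.zeros(hidden_dim)
        self.bh = np.zeros(hidden_dim)
        self.hidden_dim = hidden_dim

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h + self.bz)          # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h + self.br)          # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h) + self.bh)
        return (1.0 - z) * h + z * h_cand

def keyword_posterior(features, cell, W_out, b_out):
    """Run the GRU over a (T, input_dim) feature sequence and softmax the
    final hidden state into class posteriors (e.g. keyword vs. filler)."""
    h = np.zeros(cell.hidden_dim)
    for x in features:
        h = cell.step(x, h)
    logits = W_out @ h + b_out
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Demo with hypothetical dimensions: 50 frames of 40-dim filterbank features.
rng = np.random.default_rng(1)
feats = rng.normal(size=(50, 40))
cell = GRUCell(input_dim=40, hidden_dim=64)
W_out = rng.normal(0.0, 0.1, (2, 64))
b_out = np.zeros(2)
post = keyword_posterior(feats, cell, W_out, b_out)  # [P(keyword), P(filler)]
```

A GRU halves the gate parameters of an LSTM (no separate cell state), which is one reason GRU encoders are attractive under a small-footprint constraint.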

Cited by 3 publications (4 citation statements)
References 11 publications
“…A number of works have since used deep architectures suitable for sequence modelling (e.g. RNNs, CNNs, or graph convolutional networks) [6,15,23,31,36,38,45,52,60,64,76], including encoder-decoder approaches [8,51,73,78]. Berg et al [9] recently proposed using a Transformer model for the same task.…”
Section: Related Work
confidence: 99%
“…In order to generate background noise, we randomly sample and crop background noises provided in the dataset. For a fair comparison, in our test set, the "silence" class test samples are taken from the open-source Speech Commands dataset test set, version 2 [19], and test samples of the other classes are taken from the officially released testing.list.…”
Section: Datasets
confidence: 99%
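The dataset preparation quoted above synthesises "silence"/background examples by randomly sampling and cropping longer noise recordings. A minimal sketch of such a crop is shown below; the function name and parameters are hypothetical, not the citing paper's code.

```python
import numpy as np

def random_noise_crop(noise, num_samples, rng=None):
    """Return a random contiguous segment of `num_samples` samples from a
    longer background-noise waveform (1-D array)."""
    rng = rng or np.random.default_rng()
    if len(noise) < num_samples:
        raise ValueError("noise recording is shorter than the requested crop")
    # integers() upper bound is exclusive, so +1 makes the last valid
    # start position reachable.
    start = int(rng.integers(0, len(noise) - num_samples + 1))
    return noise[start:start + num_samples]

# Demo: crop one second (16 kHz) from a five-second noise recording.
noise = np.arange(5 * 16000, dtype=np.float32)
segment = random_noise_crop(noise, 16000, rng=np.random.default_rng(0))
```

Randomising the crop position each epoch gives many distinct background examples from a handful of recordings, which is why this augmentation is standard for the Speech Commands-style "silence" class.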
“…On the other hand, deep neural networks (DNNs) have recently proven to yield efficient small-footprint solutions for KWS [8,9,10,11,12,13,14,15,16]. In particular, more advanced architectures, such as Convolutional Neural Networks (CNNs), have been applied to KWS problems under limited memory footprint and computational resources, showing excellent accuracy.…”
Section: Introduction
confidence: 99%
“…RNNs are also combined with convolutional layers [7,25,27] to simultaneously model local features and temporal dependencies. Recent works also explore seq2seq models for KWS [9,31,45,47].…”
Section: Related Work
confidence: 99%