Interspeech 2019
DOI: 10.21437/interspeech.2019-1877

Binary Speech Features for Keyword Spotting Tasks

Abstract: Keyword spotting is a classification task which aims to detect a specific set of spoken words. In general, this type of task runs on a power-constrained device such as a smartphone. One method to reduce the power consumption of a keyword spotting algorithm (typically a neural network) is to reduce the precision of the network weights and activations. In this paper, we propose a new representation of speech features which is more adapted to low-precision networks and compatible with binary/ternary neural networ…

Cited by 7 publications (11 citation statements)
References 11 publications
“…The same philosophy can be applied to speech features. Emerging research [69] studies two kinds of low-precision speech representations: linearly-quantized log-Mel spectrogram and power variation over time, derived from log-Mel spectrogram, represented by only 2 bits. Experimental results show that using 8-bit log-Mel spectra yields same KWS accuracy as employing full-precision MFCCs.…”
Section: Low-precision Features (mentioning)
confidence: 99%
“…Furthermore, KWS performance degradation is insignificant when exploiting 2bit precision speech features. As the authors of [69] state, this fact might indicate that much of the spectral information is superfluous when attempting to spot a set of keywords. In [82], we independently arrived at the same finding.…”
Section: Low-precision Features (mentioning)
confidence: 99%
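
The citation statements above mention two low-precision representations: a linearly quantized log-Mel spectrogram, and a power variation over time that fits in 2 bits. The following is a minimal sketch of how such features might be computed, not the paper's actual method. The function names, the global min/max scaling, the ternary {-1, 0, +1} coding, and the `threshold` value are all assumptions made for illustration, and the input is a toy array rather than a real spectrogram.

```python
def quantize_linear(frames, n_bits=8):
    """Linearly map values to integer codes in [0, 2**n_bits - 1].

    Assumption: scaling uses the global min and max of the input.
    """
    flat = [v for frame in frames for v in frame]
    lo, hi = min(flat), max(flat)
    levels = (1 << n_bits) - 1
    if hi == lo:
        # Constant input: every value maps to code 0.
        return [[0] * len(frame) for frame in frames]
    scale = levels / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in frame] for frame in frames]


def power_variation(frames, threshold=0.5):
    """Ternary code per band against the previous frame.

    +1 if the value rose by more than `threshold`, -1 if it fell by more
    than `threshold`, 0 otherwise. Three states fit in 2 bits.
    Assumption: the threshold and the ternary coding are hypothetical.
    """
    out = []
    for prev, cur in zip(frames, frames[1:]):
        row = []
        for p, c in zip(prev, cur):
            d = c - p
            row.append(1 if d > threshold else -1 if d < -threshold else 0)
        out.append(row)
    return out


# Toy log-Mel energies: 3 frames x 2 bands (made-up values).
log_mel = [[-4.0, 0.0], [-2.0, 0.2], [-3.0, 4.0]]
codes = quantize_linear(log_mel, n_bits=8)
deltas = power_variation(log_mel, threshold=0.5)
print(codes)
print(deltas)
```

With 8 bits the codes span 0 to 255. The variation feature has one fewer frame than the input, because each row compares a frame with its predecessor.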