Interspeech 2020
DOI: 10.21437/interspeech.2020-1193

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

citations
Cited by 54 publications
(39 citation statements)
references
References 0 publications
“…In this work, we propose a switching method between observed mixture and enhanced speech for overlapping speech. Similarly, a preceding work called Voice Filter Light [19] switched observed mixture and enhanced speech to improve ASR results.…”
Section: Related Work (mentioning)
confidence: 99%
“…This helps reduce computational cost and energy consumption, particularly in scenarios where a keyword detector is not preferable. VoiceFilter-Lite [17] is a single-channel source separation model that runs on-device to preserve only the speech signals from a target user as part of a streaming speech recognition system. Similarly, Xue et al. in [18] propose a method called speaker tracing buffer, which can track speaker information consistently across the chunk by extending a self-attention mechanism to maintain the speaker permutation information determined in previous chunks.…”
Section: Speech Separation and Discretization (mentioning)
confidence: 99%
“…Moreover, the non-causal nature of the convolutions and the bidirectional recurrent units makes these aforementioned approaches unsuitable for real-time, low-complexity applications. Recently, Voicefilter-lite, a real-time alternative to the Voicefilter has been proposed [16] to improve the performance of speech recognition systems in multi-talker situations. Although Voicefilter-lite showed impressive performance for overlapped speech recognition, it was not designed to improve human perception or intelligibility under such conditions, which is the need of the hour for real-time audio communication systems.…”
Section: Introduction (mentioning)
confidence: 99%