2006
DOI: 10.1109/tmm.2006.884611

Combining Media-Specific FEC and Error Concealment for Robust Distributed Speech Recognition Over Loss-Prone Packet Channels

Abstract: This paper presents a mixed recovery scheme for robust distributed speech recognition (DSR) implemented over a packet channel which suffers packet losses. The scheme combines media-specific forward error correction (FEC) and error concealment (EC). Media-specific FEC is applied at the client side, where FEC bits representing strongly quantized versions of the speech vectors are introduced. At the server side, the information provided by those FEC bits is used by the EC algorithm to improve the recogni…

Cited by 20 publications (6 citation statements)
References 20 publications
“…In general, SPQER is not bound to a specific S2T API. It would also be possible to use IBM Watson. A free software alternative is Kaldi [16].…”
Section: A Word Recognition
mentioning
confidence: 99%
“…WER There are several publications dedicated to the effect of packet loss on speech recognition, i.e., [3] or [13]. Other research focuses on the improvement of speech recognition in lossy scenarios by using FEC algorithms [4] [1]. All these studies aimed to train better speech recognition models with improved robustness against packet loss.…”
Section: B Text-based Speech Recognition Metrics
mentioning
confidence: 99%
“…Initially, we can use the same distribution presented in previous papers (Peinado et al., 2005; Gómez et al., 2006), where vectors (replicas) corresponding to the frames located T_fec time units before and after the current frame pair are also included within the packet. Figure 4 depicts an example of this scheme.…”
Section: Introduction Of Fec Codes
mentioning
confidence: 99%
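The packet layout described in the statement above can be sketched as follows. This is a minimal illustration and not the cited authors' implementation: the names `Packet`, `build_packets`, `recover_pair` and `coarse` are hypothetical, T_fec is assumed to be counted in frame pairs (one pair per packet), and the replica quantizer is a crude stand-in for the coarse quantization the papers describe.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple

FramePair = Tuple[float, ...]  # stand-in for a pair of feature vectors


class Packet(NamedTuple):
    index: int                            # position of the primary frame pair
    frames: FramePair                     # primary (finely coded) frame pair
    replica_before: Optional[FramePair]   # coarse copy of pair index - t_fec
    replica_after: Optional[FramePair]    # coarse copy of pair index + t_fec


def build_packets(pairs: List[FramePair], t_fec: int,
                  quantize: Callable[[FramePair], FramePair]) -> List[Packet]:
    """Attach coarse replicas of the pairs t_fec positions before and after."""
    packets = []
    for i, pair in enumerate(pairs):
        before = quantize(pairs[i - t_fec]) if i - t_fec >= 0 else None
        after = quantize(pairs[i + t_fec]) if i + t_fec < len(pairs) else None
        packets.append(Packet(i, pair, before, after))
    return packets


def recover_pair(received: Dict[int, Packet], i: int,
                 t_fec: int) -> Optional[FramePair]:
    """Return pair i, falling back to a replica carried by a neighbouring packet."""
    if i in received:
        return received[i].frames
    later = received.get(i + t_fec)       # its replica_before describes pair i
    if later is not None and later.replica_before is not None:
        return later.replica_before
    earlier = received.get(i - t_fec)     # its replica_after describes pair i
    if earlier is not None and earlier.replica_after is not None:
        return earlier.replica_after
    return None                           # nothing survived: leave to concealment


def coarse(pair: FramePair) -> FramePair:
    """Hypothetical coarse quantizer: truncate each value to an integer."""
    return tuple(float(int(x)) for x in pair)


if __name__ == "__main__":
    pairs = [(i + 0.25, i + 0.75) for i in range(6)]
    sent = build_packets(pairs, t_fec=2, quantize=coarse)
    received = {p.index: p for p in sent if p.index != 3}  # packet 3 is lost
    print(recover_pair(received, 3, t_fec=2))
```

When `recover_pair` returns `None`, neither the packet nor any replica arrived, which is the case where a receiver would have to rely on error concealment alone.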
“…In our case, each replica, containing the 14 features (13 MFCCs plus log-Energy) is vector quantized (VQ) using a codebook with 2^N codewords (N bits). In our previous works (Peinado et al., 2005; Gómez et al., 2006), VQ codebooks were trained using the k-means algorithm over the speech recognizer training database. Although different sets were used for training and testing, replicas could be over-adapted to the database and, in particular, to its vocabulary.…”
Section: Introduction Of Fec Codes
mentioning
confidence: 99%
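The vector quantization step mentioned in the statement above can be illustrated with a small k-means codebook trainer. This is a sketch under stated assumptions, not the cited authors' code: the function names are hypothetical, the training data is random rather than real MFCC features, and plain Lloyd iterations with random initialisation stand in for whatever k-means variant the papers used.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[List[float]], v: Vector) -> int:
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], v))


def train_codebook(vectors: List[Vector], n_bits: int,
                   iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Train a codebook of 2**n_bits codewords with Lloyd (k-means) iterations."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, 2 ** n_bits)]
    for _ in range(iters):
        cells: List[List[Vector]] = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(codebook, v)].append(v)
        for k, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def encode(codebook: List[List[float]], v: Vector) -> int:
    """A replica is transmitted as an n_bits-wide codeword index."""
    return nearest(codebook, v)


def decode(codebook: List[List[float]], index: int) -> List[float]:
    """The receiver looks the index up in the same codebook."""
    return codebook[index]


if __name__ == "__main__":
    rng = random.Random(1)
    # Random 14-dimensional vectors standing in for 13 MFCCs plus log-energy.
    data = [[rng.gauss(0.0, 1.0) for _ in range(14)] for _ in range(200)]
    book = train_codebook(data, n_bits=3)
    idx = encode(book, data[0])
    print(idx, len(decode(book, idx)))
```

A replica then costs only N bits per vector, at the price of the coarse reconstruction returned by `decode`.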
“…It is worth highlighting that SWV allows the interaction between the language and acoustic models in ASR just like in human perception: the language model has a higher weight in those frames with low SNR or low reliability (Yoma et al., 2003-B). Finally, the concept of uncertainty in noise canceling and weighted recognition algorithms (Yoma et al., 1995; 1996-A; 1996-B; 1997-A; 1997-B; 1998-A; 1998-B; 1998-C; 1999) have also widely been employed elsewhere in the fields of ASR and SV in later publications (Acero et al., 2006-A; 2006-B; Arrowood & Clements, 2004; Bernard & Alwan, 2002; Breton, 2005; Chan & Siu, 2004; Cho et al., 2002; Delaney, 2005; Deng et al., 2005; Erzin et al., 2005; Gomez et al., 2006; Hung et al., 1998; Keung et al., 2000; Kitaoka & Nakagawa, 2002; Li, 2003; Liao & Gales, 2005; Pfitzinger, 2000; Pitsikalis et al., 2006; Tan et al., 2005; Vildjiounaite et al., 2006; Wu & Chen, 2001).…”
mentioning
confidence: 99%