2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World Of
DOI: 10.1109/icme.2000.869589

Towards efficient and scalable speech compression schemes for robust speech recognition applications

Cited by 7 publications (4 citation statements)
References: 5 publications

“…6 and 7 is that the bitrate required to ensure that speech recognition performance is not degraded due to compression is about 4600 b/s for the spoken names task and 5700 b/s for the CSR task. Additionally, it has been shown that the approximate minimum bitrate for transparent operation for an isolated digits task was 1100 b/s (Srinivasamurthy et al, 2001a) and for a connected digits task was 2000 b/s (Srinivasamurthy et al, 2001b). This illustrates that the minimum bitrate for transparent speech recognition is strongly task dependent.…”
Section: Continuous Speech Recognition (mentioning)
confidence: 98%
“…Due to this overlap and the underlying correlation in the speech (because of the slow movement of articulators), it is reasonable to expect that MFCC vectors from adjacent frames will exhibit high correlation. To achieve good compression efficiency this correlation has been exploited using linear prediction (Ramaswamy and Gopalakrishnan, 1998; Srinivasamurthy et al, 2000), where a given MFCC in a frame was predicted from the corresponding MFCC in the previous frame. The prediction error $e_i = u_i - a\hat{u}_{i-1}$ was quantized using uniform scalar quantization (USQ), where $u_i$ is the current sample and $\hat{u}_{i-1}$ is the reconstruction of the previous sample generated by the coarse prediction loop.…”
Section: Scalable Encoding (mentioning)
confidence: 99%
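
The prediction scheme quoted above can be illustrated with a minimal sketch. It is not the cited authors' implementation: it assumes a single prediction loop applied to one MFCC coefficient tracked across frames, and the names `dpcm_encode` and `dpcm_decode`, the predictor coefficient `a = 0.9`, and the quantizer step `step = 0.5` are all hypothetical choices made for illustration. The layered (coarse plus refinement) structure of the scalable coder is not modeled.

```python
from typing import List, Sequence, Tuple


def dpcm_encode(samples: Sequence[float], a: float = 0.9,
                step: float = 0.5) -> Tuple[List[int], List[float]]:
    """Encode one MFCC coefficient track with first-order prediction + USQ.

    a and step are illustrative assumptions, not values from the source.
    Returns the quantizer indices and the encoder-side reconstructions.
    """
    indices: List[int] = []
    recon: List[float] = []
    prev_hat = 0.0  # reconstruction of the previous sample (initial state assumed 0)
    for u in samples:
        e = u - a * prev_hat               # prediction error e_i = u_i - a * u_hat_{i-1}
        q = round(e / step)                # uniform scalar quantization to an integer index
        u_hat = a * prev_hat + q * step    # reconstruction the decoder will also compute
        indices.append(q)
        recon.append(u_hat)
        prev_hat = u_hat                   # predict from the reconstruction, not the input
    return indices, recon


def dpcm_decode(indices: Sequence[int], a: float = 0.9,
                step: float = 0.5) -> List[float]:
    """Rebuild the coefficient track from quantizer indices alone."""
    recon: List[float] = []
    prev_hat = 0.0
    for q in indices:
        u_hat = a * prev_hat + q * step
        recon.append(u_hat)
        prev_hat = u_hat
    return recon
```

Predicting from the reconstructed previous sample rather than the original keeps the encoder and decoder in step, so the quantization error in each frame stays within half a step instead of accumulating over time.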
“…Another approach consists in providing each processing task with the most relevant information in order to maximize its classification accuracy. One potential solution to this problem has been addressed in [26] in the context of scalable speech recognition. The authors considered two sequential speech recognition systems with very different resource requirements.…”
Section: Discussion (mentioning)
confidence: 99%