2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009
DOI: 10.1109/icassp.2009.4959505

Unified speech and audio coding scheme for high quality at low bitrates

Abstract: Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the cod…

Cited by 34 publications (22 citation statements)
References 5 publications

“…In USAC [34], a recent MPEG standard, the MDCT plays an important role [35]. In the USAC encoder, the MDCT coefficients are first companded with a power-law function before scalar quantization, which in effect achieves non-uniform scalar quantization.…”
Section: Results (mentioning)
confidence: 99%
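The companding step described in this statement can be sketched as follows. This is a minimal illustration, not the USAC reference quantizer: the exponent 0.75 is an assumption borrowed from AAC-style quantizers (the quoted text only says "power-law function"), and the names `quantize`, `dequantize` and `step` are hypothetical.

```python
import math

# Assumed exponent (AAC-style); the quoted statement does not give a value.
EXPONENT = 0.75


def quantize(coeffs, step, exponent=EXPONENT):
    """Compand |x| / step with a power law, then round to an integer index."""
    indices = []
    for x in coeffs:
        companded = (abs(x) / step) ** exponent
        magnitude = math.floor(companded + 0.5)  # uniform rounding in the companded domain
        indices.append(int(math.copysign(magnitude, x)))
    return indices


def dequantize(indices, step, exponent=EXPONENT):
    """Expand integer indices back to coefficient values (inverse power law)."""
    return [math.copysign(abs(q) ** (1.0 / exponent) * step, q) for q in indices]


if __name__ == "__main__":
    # Reconstruction levels for indices 0..5: the spacing grows with magnitude,
    # which is what makes the overall quantizer non-uniform.
    levels = dequantize(range(6), step=1.0)
    gaps = [b - a for a, b in zip(levels, levels[1:])]
    print(levels)
    print(gaps)
```

Because rounding is uniform only in the companded domain, large coefficients are quantized with coarser steps than small ones, without needing a separate non-uniform codebook.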
“…However, previous research [5], [6] … poor coding gain is that the conventional DFT-based TCX is not based on critical sampling, which causes low frequency resolution and overhead data during the core-coding transitions. Another problem associated with AMR-WB+ TCX is the block artifact, which is caused by the short overlap between TCX frames.…”
Section: AMR-WB+ TCX (mentioning)
confidence: 99%
“…On the other hand, HE-AAC does not perform well for speech signals, since it cannot use a small bit budget as efficiently as linear predictive (LP) coders when encoding speech [5], [6]. At 16 to 20 kbps, the music quality of AMR-WB+ is significantly worse than that of HE-AAC v2 [6]. One of the major reasons is overhead information, particularly during the core-coding transitions, due to non-critical sampling with a low frequency resolution.…”
Section: Introduction (mentioning)
confidence: 99%
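For contrast with the non-critical sampling mentioned in this statement, the sketch below shows what critical sampling means for an MDCT: frames overlap by 50%, yet each hop of N input samples produces only N coefficients, and overlap-add still reconstructs the signal. It is a generic, direct-form illustration under stated assumptions (sine window, hop size `N`, zero padding of one hop at each end, signal length a multiple of `N`), not the AMR-WB+ TCX or USAC implementation; all function names are hypothetical.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def _basis(n, k, N):
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))


def mdct(block, window):
    """2N windowed samples in, N coefficients out."""
    N = len(block) // 2
    return [sum(window[n] * block[n] * _basis(n, k, N) for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs, window):
    """N coefficients in, 2N windowed (time-aliased) samples out."""
    N = len(coeffs)
    return [2.0 / N * window[n] * sum(coeffs[k] * _basis(n, k, N) for k in range(N))
            for n in range(2 * N)]


def analyze_synthesize(signal, N):
    """Run MDCT analysis and overlap-add synthesis; len(signal) must be a multiple of N."""
    window = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    n_coeffs = 0
    for start in range(0, len(padded) - 2 * N + 1, N):  # hop of N samples
        coeffs = mdct(padded[start:start + 2 * N], window)
        n_coeffs += len(coeffs)
        for n, v in enumerate(imdct(coeffs, window)):
            out[start + n] += v  # aliasing cancels between neighboring frames
    return out[N:N + len(signal)], n_coeffs


if __name__ == "__main__":
    x = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
    y, count = analyze_synthesize(x, N=8)
    print(max(abs(a - b) for a, b in zip(x, y)), count)
```

The coefficient count exceeds the sample count only by the one extra frame introduced by the padding, so the rate approaches one coefficient per input sample as the signal gets longer.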
“…The FD and LPD core modules process music- and speech-like input signals, respectively. The FD/LPD core modules are controlled by a signal classifier, and thus the performance of the USAC system depends heavily on the performance of the signal classifier tool [2], [7]. In this letter, we propose an LPD single-mode USAC system that does not require a signal classifier.…”
Section: Introduction (mentioning)
confidence: 99%