1997 IEEE Workshop on Speech Coding for Telecommunications Proceedings. Back to Basics: Attacking Fundamental Problems in Speec
DOI: 10.1109/scft.1997.623920

Perceptual coding of narrowband audio signals at 8 kbit/s

Abstract: New applications such as Internet broadcast and communications, consumer multimedia products, digital AM broadcast and satellite networks are emerging. Those applications require moderate audio quality without annoying artifacts at bit rates below 16 kbit/s. Although speech coders provide high speech quality at bit rates around 8 kbit/s, they perform poorly when encoding audio signals. In this thesis, we present a novel transform coding paradigm based on the characteristics of the human hearing system. The pro…

Cited by 7 publications
(6 citation statements)
References 79 publications
“…When the radius reaches zero and the neighborhood contains only the winning node itself, the training stops after reaching equilibrium. The learning rate is given by (15), where the SMR value is expressed in log scale and is used to compute the learning rate for all components in the corresponding subvector. The term that depends on the SMR has the form of a sigmoidal function, which saturates to 1 if the SMR is large and to zero if it is small.…”
Section: Perceptually Weighted Btsofm
mentioning
confidence: 99%
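The statement above describes a learning rate that depends on the SMR through a sigmoidal term. Equation (15) itself is not reproduced on this page, so the following is only a minimal sketch of that idea: the logistic form, the parameters `midpoint_db` and `slope_db`, and the helper names are all assumptions made for illustration, not the cited paper's formula.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smr_weight(smr_db, midpoint_db=0.0, slope_db=3.0):
    """Hypothetical sigmoidal factor: near 1 for large SMR, near 0 for small SMR.

    midpoint_db and slope_db are illustrative parameters, not values from the source.
    """
    return sigmoid((smr_db - midpoint_db) / slope_db)


def update_subvector(codeword, target, base_rate, smr_db):
    """Move every component of one subvector toward the target.

    A single SMR value sets the rate for all components of the subvector,
    as the quoted statement describes. base_rate stands in for the usual
    time-dependent SOFM learning rate (assumed here).
    """
    rate = base_rate * smr_weight(smr_db)
    return [c + rate * (x - c) for c, x in zip(codeword, target)]
```

With these assumed defaults, a subvector whose SMR is well above the midpoint is updated at nearly the full base rate, while one with a strongly negative SMR is left almost unchanged.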
“…Subjective evaluation showed that the sound quality of the decoded audio of TWIN-VQ exceeds that of the MPEG1 Layer II coder at the same bit rate [14]. Several other reports also showed the advantages of vector quantization in audio coding [15]-[19]. However, none of these methods take psychoacoustic effects into account during codebook design and vector encoding.…”
mentioning
confidence: 99%
“…For any audible spectral component, the error energy due to a phase error must be below the masking threshold, as expressed in (7), where A and φ are the amplitude and phase of the component, Δφ is the phase error, and mth is the corresponding masking threshold. The worst case occurs when the cosine function has the highest rate of variation, that is, when φ = π/2.…”
Section: Upper Bound For Phase Errors
mentioning
confidence: 99%
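Equation (7) is not reproduced on this page. As a rough illustration of the constraint described above, the sketch below assumes the error energy of a component is modeled as the squared difference between A cos(φ) and A cos(φ + Δφ), and evaluates it at φ = π/2 as the quoted text does. Under that assumption the condition becomes A² sin²(Δφ) ≤ mth. The function names and the exact error model are hypothetical.

```python
import math


def phase_error_energy(amplitude, phase, phase_error):
    """Assumed error model: squared difference of the two cosine values."""
    diff = amplitude * math.cos(phase) - amplitude * math.cos(phase + phase_error)
    return diff * diff


def max_phase_error(amplitude, masking_threshold):
    """Largest phase error (radians) satisfying A^2 * sin^2(dphi) <= mth.

    This is the assumed model evaluated at phase = pi/2. If the component's
    own energy A^2 does not exceed the threshold, the condition holds for
    every phase error, so pi is returned to mean "unconstrained".
    """
    if amplitude <= 0.0 or masking_threshold < 0.0:
        raise ValueError("amplitude must be positive and threshold non-negative")
    ratio = math.sqrt(masking_threshold) / amplitude
    if ratio >= 1.0:
        return math.pi
    return math.asin(ratio)


# Example: unit amplitude with a threshold 6 dB below the component energy.
bound = max_phase_error(1.0, 0.25)
```

The bound tightens as the masking threshold drops relative to the component energy, which matches the intuition that strongly audible components tolerate less phase error.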
“…To accomplish this goal, we have developed a new audio coding structure based on the characteristics of the human hearing system. The proposed coder, which is referred to as the Narrowband Perceptual Audio Coder (NPAC), provides moderate quality for narrowband (4 kHz bandwidth) audio inputs at bit rates down to 8 kbit/s [33,34,35]. The proposed coder employs a number of different coding techniques which are described in this thesis.…”
Section: Thesis Contributions
mentioning
confidence: 99%
“…We use a modified version of the LBG algorithm [102] with the following perceptually based distortion measure, based on the audible noise energy, to design the codebooks [33]. The same error criterion is used to select the best codewords in encoding the input vectors.…”
Section: Perceptually Trained Vq
mentioning
confidence: 99%
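The distortion measure referred to above is not shown on this page. The sketch below illustrates the general idea of training and encoding with the same perceptual criterion, using a placeholder distortion: the part of each component's squared error that exceeds its masking threshold. That definition, the function names, and the plain-mean centroid update are assumptions made for illustration. Only the Lloyd iteration at the core of LBG is shown, without the codebook-splitting step.

```python
def audible_noise_energy(x, codeword, thresholds):
    """Placeholder distortion: error energy above the masking threshold, summed."""
    return sum(
        max(0.0, (xi - ci) ** 2 - mi)
        for xi, ci, mi in zip(x, codeword, thresholds)
    )


def nearest_codeword(x, thresholds, codebook):
    """Index of the codeword with the least audible noise (also used for encoding)."""
    return min(
        range(len(codebook)),
        key=lambda k: audible_noise_energy(x, codebook[k], thresholds),
    )


def lloyd_iteration(vectors, thresholds, codebook):
    """One pass: assign vectors by perceptual distortion, then re-center cells."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x, m in zip(vectors, thresholds):
        k = nearest_codeword(x, m, codebook)
        counts[k] += 1
        for i in range(dim):
            sums[k][i] += x[i]
    updated = []
    for k, old in enumerate(codebook):
        if counts[k] == 0:
            updated.append(list(old))  # keep empty cells unchanged
        else:
            # Simplification: the plain mean is not the exact centroid
            # for this distortion measure.
            updated.append([s / counts[k] for s in sums[k]])
    return updated


def train_codebook(vectors, thresholds, codebook, iterations=10):
    """Repeat the Lloyd pass a fixed number of times."""
    for _ in range(iterations):
        codebook = lloyd_iteration(vectors, thresholds, codebook)
    return codebook
```

Each training vector carries its own threshold vector, so the same numerical error can count as distortion for one vector and be fully masked for another. With all thresholds set to zero the placeholder reduces to the ordinary squared-error criterion.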