Mixed Stereo Audio Classification Using a Stereo-Input Mixed-to-Panned Level Feature

Chen, Austin; Hasegawa‐Johnson, Mark

doi:10.1109/taslp.2014.2359628

Cited by 6 publications

(3 citation statements)

References 40 publications


“…Due to the potential uses for broadcast and other media, many previous studies on speech/music discrimination have been undertaken; however, it is still conceivable to broaden the experimental scope to include samples of speech with varied quantities of background music. A. Chen et.al in [7] have discussed the development and evaluation of two measurements of the ratio between speech and music energy are the subject of this paper: a feature called the stereo-input mix-to-peripheral level feature (SIMPL), which is computed from the stereo mixed signal as an approximate estimate of SMR, and a reference measure called speech-to-music ratio (SMR), which is known objectively only prior to mixing. SIMPL is an objective signal measure determined using broadcast mixing procedures, in which vocals, unlike most instruments, are often positioned at stereo centre.…”

Section: Review (mentioning)

confidence: 99%

“…Gaussian mixture models (GMM) are used in this work because they perform similarly to the best reported methods on similar regression and classification tasks and can be implemented using ordinary voice recognition software. The research carried out by A. Chen et.al in [7], established a simple audio categorization feature that approximates the ratio of energy present in recorded voice or speaking and instrumental portions based on their normal stereo mix placements. SIMPL demonstrated to be a valuable feature in speech/music discrimination applications, with 81.9 percent success rates for three-way classification and 97.9% for two-way classification.…”

Section: Review (mentioning)

confidence: 99%

“…This is done in [6] by implementing an optimized audio classification and segmentation algorithm by the use of the ensembled bagged trees. In [7], mixed stereo audio classification is done using a stereo-input mixed-to-panned level feature. The new measure called the speech to music ratio is also calculated here.…”

Section: Introduction (mentioning)

confidence: 99%


A Review on Machine Learning for Audio Applications

Nagesh, Kumari

2021

JUSST


Audio processing is an important branch under the signal processing domain. It deals with the manipulation of the audio signals to achieve a task like filtering, data compression, speech processing, noise suppression, etc. which improves the quality of the audio signal. For applications such as natural language processing, speech generation, automatic speech recognition, the conventional algorithms aren’t sufficient. There is a need for machine learning or deep learning algorithms which can be implemented so that the audio signal processing can be achieved with good results and accuracy. In this paper, a review of the various algorithms used by researchers in the past has been described and gives the appropriate algorithm that can be used for the respective applications.
