2014
DOI: 10.1109/taslp.2014.2359628

Mixed Stereo Audio Classification Using a Stereo-Input Mixed-to-Panned Level Feature

Abstract: Many past studies have been conducted on speech/music discrimination due to the potential applications for broadcast and other media; however, it remains possible to expand the experimental scope to include samples of speech with varying amounts of background music. This paper focuses on the development and evaluation of two measures of the ratio between speech energy and music energy: a reference measure called speech-to-music ratio (SMR), which is known objectively only prior to mixing, and a feature called …
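
The abstract describes SMR as a ratio between speech energy and music energy that is known only before mixing, when the two tracks are still separate. A minimal sketch of such a ratio follows, assuming energy is the sum of squared samples and that the ratio is expressed in decibels; neither detail is confirmed by the abstract, and the function names are hypothetical rather than taken from the paper.

```python
import math


def energy(samples):
    """Sum of squared sample values (assumed definition of energy)."""
    return sum(x * x for x in samples)


def speech_to_music_ratio_db(speech, music):
    """Hypothetical helper: energy ratio of the pre-mix speech and
    music tracks, expressed in dB (the dB scale is an assumption)."""
    e_speech = energy(speech)
    e_music = energy(music)
    if e_speech == 0 or e_music == 0:
        raise ValueError("both tracks need non-zero energy")
    return 10.0 * math.log10(e_speech / e_music)


# Toy pre-mix tracks: speech amplitude is ten times the music amplitude,
# so the energy ratio is about 100, i.e. roughly 20 dB.
speech = [0.5, -0.5, 0.5, -0.5]
music = [0.05, -0.05, 0.05, -0.05]
smr_db = speech_to_music_ratio_db(speech, music)
```

Because this needs the unmixed tracks, it can only serve as a reference value; a feature computed from the mixed stereo signal has to approximate it by other means.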

Cited by 6 publications (3 citation statements)
References 40 publications

“…Due to the potential uses for broadcast and other media, many previous studies on speech/music discrimination have been undertaken; however, it is still conceivable to broaden the experimental scope to include samples of speech with varied quantities of background music. A. Chen et.al in [7] have discussed the development and evaluation of two measurements of the ratio between speech and music energy are the subject of this paper: a feature called the stereo-input mix-to-peripheral level feature (SIMPL), which is computed from the stereo mixed signal as an approximate estimate of SMR, and a reference measure called speech-to-music ratio (SMR), which is known objectively only prior to mixing. SIMPL is an objective signal measure determined using broadcast mixing procedures, in which vocals, unlike most instruments, are often positioned at stereo centre.…”
Section: Review (citation type: mentioning)
confidence: 99%
“…Gaussian mixture models (GMM) are used in this work because they perform similarly to the best reported methods on similar regression and classification tasks and can be implemented using ordinary voice recognition software. The research carried out by A. Chen et.al in [7], established a simple audio categorization feature that approximates the ratio of energy present in recorded voice or speaking and instrumental portions based on their normal stereo mix placements. SIMPL demonstrated to be a valuable feature in speech/music discrimination applications, with 81.9 percent success rates for three-way classification and 97.9% for two-way classification.…”
Section: Review (citation type: mentioning)
confidence: 99%