Interspeech 2019
DOI: 10.21437/interspeech.2019-1925

On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music

Abstract: Singer identification is the task of automatically identifying the singer in a music recording, such as a polyphonic song. A song has two major acoustic components: singing vocals and background accompaniment. Although singer identification is similar to speaker identification, it is more challenging because the background accompaniment interferes with the singer-specific information in the singing vocals. We believe that separating the background accompaniment from the singing vocal will help us to overcome the interferen…

Cited by 18 publications (14 citation statements)
References 25 publications
“…While the idea of using SS to improve SID has been attempted before [11, 14–16], our work differs from the prior arts in two ways. First, except for the concurrent work [16], the prior arts that we are aware of did not use deep learning-based SS models. In contrast, in our work both the SS model and the SID model employ deep learning.…”
Section: Conv Block
Citation type: mentioning
confidence: 99%
“…Second, unlike prior arts (including [16]), we investigate one additional way to employ SS to improve SID. Given the separated vocal tracks and instrumental tracks of the audio recordings in the training set, we perform the so-called "data augmentation" [19–22] by randomly shuffling the separated tracks of different songs and then remixing them.…”
Section: Conv Block
Citation type: mentioning
confidence: 99%
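
The shuffle-and-remix augmentation described in the statement above can be illustrated with a minimal sketch. This is not the citing authors' implementation: the function name `shuffle_and_remix`, the use of plain Python lists in place of audio arrays, and the choice to truncate each pair to the shorter track are all assumptions made for illustration. The sketch assumes the singer label follows the vocal track, so each remix is returned with the index of the vocal it was built from.

```python
import random


def shuffle_and_remix(vocals, accompaniments, seed=0):
    """Pair each separated vocal with a randomly assigned accompaniment and sum them.

    vocals, accompaniments: equal-length lists; each item is a list of float samples
    (a stand-in for a separated audio track).
    Returns a list of (remixed_samples, vocal_index, accompaniment_index).
    """
    rng = random.Random(seed)

    # Random permutation of accompaniment indices: every backing track is used once.
    order = list(range(len(accompaniments)))
    rng.shuffle(order)

    remixes = []
    for v_idx, a_idx in enumerate(order):
        vocal = vocals[v_idx]
        backing = accompaniments[a_idx]
        # Assumption: tracks of different length are truncated to the shorter one.
        n = min(len(vocal), len(backing))
        mix = [vocal[i] + backing[i] for i in range(n)]
        remixes.append((mix, v_idx, a_idx))
    return remixes


# Toy example: three "songs" with very short tracks.
vocals = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
accompaniments = [[1.0, 1.0, 1.0], [2.0, 2.0], [3.0, 3.0, 3.0]]
remixes = shuffle_and_remix(vocals, accompaniments, seed=1)
```

A random permutation may leave some vocals paired with their original accompaniment; whether such pairs should be excluded is not stated in the quoted text.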
“…Research focuses of existing SID approaches can be grouped into three categories: i) simply considering as an issue of the speaker identification (SPID) [3], ii) directly identifying singers ignoring the influence of background music for the singer voice [4,5], and iii) distinguishing singers by voice characteristics after removing the interference of background music [6]. Hamid et al [3] used i-vector that is first introduced in SPID task, which aims to extract song-level descriptors built from frame-level timbre features.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Some new methods are proposed using singing voice separation [8,9] as pre-processing. Sharma et al [6] extract the singing vocals from polyphonic songs using Wave-U-Net based approach to overcome the interference of background accompaniment, which outperforms the baseline without audio source separation by a large margin. Our method belongs to the second category.…”
Section: Introduction
Citation type: mentioning
confidence: 99%