Improving Singing Voice Separation Using Curriculum Learning on Recurrent Neural Networks

Kang, Seungtae; Park, Jeong-Sik; Jang, Gil-Jin

doi:10.3390/app10072465

Cited by 3 publications

(3 citation statements)

References 27 publications

“…For singing voice separation, in [47], a curriculum learning approach was considered, where the training begins with easy examples and the difficulty is steadily increased. Three different databases were tested: MIR-1K [48], ccMixter [49], and MUSDB18 [50], with the model yielding improved performance with respect to the global normalized source distortion ratio measure.…”

Section: B Recurrent Neural Network (RNN) (mentioning)

confidence: 99%

Music Deep Learning: Deep Learning Methods for Music Signal Processing—A Review of the State-of-the-Art

et al. 2023

The discipline of Deep Learning has been recognized for its strong computational tools, which have been extensively used in data and signal processing, with innumerable promising results. Among the many commercial applications of Deep Learning, Music Signal Processing has received an increasing amount of attention over the last decade. This work reviews the most recent developments in Deep Learning in Music signal processing. Two main applications that are discussed are Music Information Retrieval, which spans a plethora of applications, and Music Generation, which can fit a range of musical styles. After a review of both topics, several emerging directions are identified for future research.

“…Kang et al [175] proposed a singing voice separation approach based on the curriculum learning framework, in which learning is started with only easy examples and then the task difficulty is gradually increased. They define easy examples as the ones in which one source is obviously dominant over the other, where the dominance factor depends on the relative intensity of vocals and instruments.…”

Section: Singing Voice Separation (mentioning)

confidence: 99%

Deep Learning Approaches in Topics of Singing Information Processing

Gupta, Goto 2022

IEEE/ACM Trans. Audio Speech Lang. Process.

Singing, the vocal production of musical tones, is one of the most important elements of music. Addressing the needs of real-world applications, the study of technologies related to singing voices has become an increasingly active area of research. In this paper, we provide a comprehensive overview of the recent developments in the field of singing information processing, specifically in the topics of singing skill evaluation, singing voice synthesis, singing voice separation, and lyrics synchronization and transcription. We will especially focus on deep learning approaches including modern representation learning techniques for singing voices. We will also provide an overview of contributions in public datasets for singing voice research.
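The curriculum idea described in the citation statement above (train first on mixtures where one source clearly dominates, then gradually admit harder ones) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from Kang et al.: the names `dominance_db` and `curriculum_stages` are hypothetical, and the absolute vocal-to-accompaniment energy ratio in dB is only one plausible way to express "relative intensity of vocals and instruments".

```python
import numpy as np


def dominance_db(vocals, accompaniment, eps=1e-12):
    """Hypothetical easiness score: absolute vocal/accompaniment energy ratio in dB.

    A large value means one source clearly dominates the mixture
    (assumed here to be an "easy" example); 0 dB means equal energy.
    """
    energy_v = np.sum(np.square(vocals)) + eps
    energy_a = np.sum(np.square(accompaniment)) + eps
    return abs(10.0 * np.log10(energy_v / energy_a))


def curriculum_stages(examples, n_stages=3):
    """Return cumulative training subsets, easiest examples first.

    `examples` is a list of (vocals, accompaniment) array pairs.
    Stage k contains the easiest k/n_stages fraction of the data,
    so the last stage is the full training set.
    """
    ranked = sorted(examples, key=lambda ex: dominance_db(*ex), reverse=True)
    stages = []
    for k in range(1, n_stages + 1):
        cutoff = int(np.ceil(len(ranked) * k / n_stages))
        stages.append(ranked[:cutoff])
    return stages


# Toy data: constant signals whose vocal gain sets the dominance.
gains = [1.0, 10.0, 0.01, 2.0]
examples = [(g * np.ones(100), np.ones(100)) for g in gains]
stages = curriculum_stages(examples, n_stages=2)
```

In a real training loop, each stage would be used for some number of epochs before moving to the next, larger one. How the stages are scheduled and how difficulty is scored are design choices that this sketch does not settle.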

“…The second approach develops supervised models based on the task optimization on a specific training dataset. Thanks to the development of deep learning, recently proposed deep source separation systems have made significant progress [10,14–16], which has begun to perform at the human level for natural source separation. The second approach has attracted significant attention because of its excellent performance, and it is also used here to solve the problem of two source separation from monaural recordings.…”

Section: Introduction (mentioning)

confidence: 99%

Sound Source Separation Mechanisms of Different Deep Networks Explained from the Perspective of Auditory Perception

et al. 2022

Thanks to the development of deep learning, various sound source separation networks have been proposed and made significant progress. However, the study on the underlying separation mechanisms is still in its infancy. In this study, deep networks are explained from the perspective of auditory perception mechanisms. For separating two arbitrary sound sources from monaural recordings, three different networks with different parameters are trained and achieve excellent performances. The networks’ output can obtain an average scale-invariant signal-to-distortion ratio improvement (SI-SDRi) higher than 10 dB, comparable with the human performance to separate natural sources. More importantly, the most intuitive principle—proximity—is explored through simultaneous and sequential organization experiments. Results show that regardless of network structures and parameters, the proximity principle is learned spontaneously by all networks. If components are proximate in frequency or time, they are not easily separated by networks. Moreover, the frequency resolution at low frequencies is better than at high frequencies. These behavior characteristics of all three networks are highly consistent with those of the human auditory system, which implies that the learned proximity principle is not accidental, but the optimal strategy selected by networks and humans when facing the same task. The emergence of the auditory-like separation mechanisms provides the possibility to develop a universal system that can be adapted to all sources and scenes.
