ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682389

A Streamlined Encoder/decoder Architecture for Melody Extraction

Abstract: Melody extraction in polyphonic musical audio is important for music signal processing. In this paper, we propose a novel streamlined encoder/decoder network that is designed for the task. We make two technical contributions. First, drawing inspiration from a state-of-the-art model for semantic pixelwise segmentation, we pass through the pooling indices between pooling and un-pooling layers to localize the melody in frequency. We can achieve results close to the state of the art with far fewer convolutional la…

Citations: cited by 52 publications (82 citation statements)
References: 16 publications
“…We compared our best melody extraction model, JDC_S(o_mv), with state-of-the-art methods using deep neural networks [17,18,21,22]. For a comparison of results under the same conditions, the test sets were ADC04, MIREX05, and MedleyDB for comparing other methods as mentioned in Section 3.1.2.…”
Section: Comparison With State-of-the-art Methods For Melody Extraction
Type: mentioning
confidence: 99%
“…Researchers have attempted various deep neural network architectures for melody extraction. Examples include fully-connected neural networks (FNN) [15,16], convolutional neural networks (CNN) [17,18], recurrent neural networks (RNN) [19], convolutional recurrent neural networks (CRNN) [20], and encoder-decoder [21,22].…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Existing melody extraction algorithms can be roughly divided into three frameworks, i.e., pitch-salience based [2], source separation based [3,4] and data-driven based methods [5][6][7]. Source separation based methods have more potential to overcome the above difficulties and substantially foster the advances of melody extraction.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Recently, encoder-decoder architecture has demonstrated its powerful performance for VME. Lu et al. [15] adopted an encoder-decoder network with dilated convolutions and Hsieh et al. [6] constructed an encoder-decoder network with pooling indices. Simulating the process of semantic segmentation, they took the combined frequency and periodicity representation as inputs and outputted a two-dimensional salience image where frequency bins with maximum values per frame were selected.…”
Section: Introduction
Type: mentioning
confidence: 99%
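
The decoding step described in the statement above (selecting, for each frame, the frequency bin with the maximum salience) can be sketched as follows. This is a minimal illustration and not the cited authors' code: the function name `decode_melody`, the `bin_freqs_hz` lookup table, and the `voicing_threshold` used to mark frames as unvoiced are all assumptions introduced for the example.

```python
def decode_melody(salience, bin_freqs_hz, voicing_threshold=0.5):
    """Pick the highest-salience frequency bin in every frame.

    salience      -- list of frames, each a list of per-bin salience values
    bin_freqs_hz  -- centre frequency (Hz) of each bin (hypothetical lookup)
    voicing_threshold -- assumed cut-off; frames whose peak falls below it
                         are reported as unvoiced (0.0 Hz)
    """
    melody = []
    for frame in salience:
        # Index of the bin with the maximum value in this frame.
        best_bin = max(range(len(frame)), key=frame.__getitem__)
        if frame[best_bin] >= voicing_threshold:
            melody.append(bin_freqs_hz[best_bin])
        else:
            melody.append(0.0)
    return melody


# Three frames, three frequency bins (toy values).
toy_salience = [
    [0.1, 0.8, 0.2],
    [0.2, 0.1, 0.3],
    [0.0, 0.4, 0.9],
]
toy_freqs = [220.0, 246.94, 261.63]
print(decode_melody(toy_salience, toy_freqs))
```

The middle frame has no bin above the assumed threshold, so it is treated as unvoiced; how the cited models actually decide voicing is not stated in the excerpt.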
“…Lu and Su addressed the melody extraction problem from the semantic segmentation on a time-frequency image perspective [15]. Afterwards, following Lu and Su's work, Hsieh et al. added links between the pooling layers of the encoder and the un-pooling layers of the decoder to reduce convolution layers and simplify convolution modules [16]. The deep learning-based methods can automatically learn high-level features, according to the training data.…”
Section: Introduction
Type: mentioning
confidence: 99%
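
The link between pooling and un-pooling layers mentioned above can be illustrated with a one-dimensional toy example along the frequency axis. This is only a sketch of the general pooling-indices idea, assuming non-overlapping windows and zero filling on un-pooling; the function names are hypothetical and the code is not taken from the paper.

```python
def max_pool_with_indices(x, size):
    """Non-overlapping max pooling that also records where each maximum sat."""
    pooled, indices = [], []
    for start in range(0, len(x) - size + 1, size):
        window = x[start:start + size]
        # Position of the maximum inside the current window.
        offset = max(range(size), key=window.__getitem__)
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices


def max_unpool(pooled, indices, length):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [0.0] * length
    for value, idx in zip(pooled, indices):
        out[idx] = value
    return out


# One frame with six frequency bins (toy values).
frame = [0.1, 0.9, 0.3, 0.2, 0.5, 0.4]
pooled, indices = max_pool_with_indices(frame, size=2)
restored = max_unpool(pooled, indices, length=len(frame))
print(pooled, indices, restored)
```

Because the decoder reuses the recorded indices, each value returns to the exact bin it came from, which is the sense in which the indices help localize the melody in frequency.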