2021
DOI: 10.3390/electronics10070810

A Comparison of Deep Learning Methods for Timbre Analysis in Polyphonic Automatic Music Transcription

Abstract: Automatic music transcription (AMT) is a critical problem in the field of music information retrieval (MIR). When AMT is approached with deep neural networks, the variety of timbres across instruments is an issue that has not yet been studied in depth. The goal of this work is to address AMT by analyzing how timbre affects monophonic transcription in a first approach based on the CREPE neural network and then to improve the results by performing polyphonic music transcription with different t…

citations
Cited by 23 publications
(6 citation statements)
references
References 21 publications
“…Eval. We can evaluate music generation from a subjective perspective [22] or an objective method. This submodule corresponds to the implementation of Yang and Lerch's evaluation method [23] for the objective evaluation of music generation.…”
Section: Discussion (mentioning)
confidence: 99%
“…The difference between RNN and the traditional NN is that RNN has the concept of timing, and the state of the next moment will be affected by the current state. Some researchers also call recurrent networks deep networks, whose depth can be shown in input, output, and time-depth (Hernandez-Olivan et al., 2021; Parmiggiani et al., 2021). The RNN structure is given in Figure 2.…”
Section: Design Of Music Style Recognition Model and Construction Of ... (mentioning)
confidence: 99%
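
The recurrence described in the statement above (the next state depends on the current state) can be illustrated with a minimal scalar sketch. This is not code from the citing paper or from the cited article: the function names `rnn_step` and `run_rnn`, the tanh activation, and the weight values are all assumptions chosen for illustration.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One Elman-style update: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Feed a sequence through the cell and return every hidden state.

    The weights are hypothetical constants, not trained values.
    """
    h = h0
    states = []
    for x in inputs:
        # The state from the previous time step is fed back into the update.
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Same input at every step, yet the state keeps changing because of the feedback.
recurrent_states = run_rnn([1.0, 1.0, 1.0])
# With w_h = 0 the feedback is removed and the cell acts like a plain feed-forward unit.
feedforward_states = run_rnn([1.0, 1.0, 1.0], w_h=0.0)
print(recurrent_states)
print(feedforward_states)
```

Setting `w_h` to zero removes the dependence on the previous state, which is the distinction from a traditional feed-forward network that the statement draws.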
“…This is important for problems such as Automatic Music Transcription (AMT) [8]. It has been identified that, for audio recordings, it is necessary to have efficient systems that identify the different musical timbres with high precision and quantitatively [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…that can be computationally extracted from the statistical analysis in the digitization of the spectrum (FFT). This focuses mainly on the statistical and mathematical characterization of the maximums in the FFT, such as the mean value in frequency (centroid) and amplitude, standard deviation, kurtosis, roots or poles of the distribution, arithmetic and geometric sequences in frequency, mean values and mean-squares of the amplitudes, among others [1,9,16–21]. Currently, there is no consensus on which and how many acoustic descriptors are to characterize musical timbre.…”
Section: Introduction (mentioning)
confidence: 99%
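
A few of the descriptors named in the statement above (centroid, standard deviation, kurtosis) can be sketched as magnitude-weighted moments of the spectrum. This is a minimal illustration under stated assumptions, not the method of the citing paper: the helper names are hypothetical, the spectrum comes from a naive DFT written with the standard library so the block stays self-contained (a real pipeline would use an FFT routine), and normalizing the magnitudes into weights is one common convention among several.

```python
import cmath
import math


def magnitude_spectrum(signal, sample_rate):
    """Naive O(N^2) DFT; returns (frequencies in Hz, magnitudes) for bins 0..N/2."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_descriptors(freqs, mags):
    """Centroid, spread (standard deviation) and kurtosis of the spectrum,
    treating the normalized magnitudes as a distribution over frequency."""
    total = sum(mags)
    if total == 0:
        raise ValueError("spectrum has no energy")
    weights = [m / total for m in mags]
    centroid = sum(w * f for w, f in zip(weights, freqs))
    variance = sum(w * (f - centroid) ** 2 for w, f in zip(weights, freqs))
    fourth = sum(w * (f - centroid) ** 4 for w, f in zip(weights, freqs))
    kurtosis = fourth / variance ** 2 if variance > 0 else float("nan")
    return {"centroid": centroid, "spread": math.sqrt(variance), "kurtosis": kurtosis}


# Hypothetical test tone: equal-amplitude partials at 100 Hz and 200 Hz.
SAMPLE_RATE = 800
N = 64  # bin width is 800 / 64 = 12.5 Hz, so both partials fall exactly on a bin
tone = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
        + math.sin(2 * math.pi * 200 * i / SAMPLE_RATE) for i in range(N)]
freqs, mags = magnitude_spectrum(tone, SAMPLE_RATE)
desc = spectral_descriptors(freqs, mags)
print(desc)
```

For this two-partial tone the centroid lands midway between the partials, near 150 Hz, with a spread near 50 Hz. Which descriptors to compute, and how many, is left open here, as the statement itself notes there is no consensus.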