2018
DOI: 10.1080/09298215.2018.1458885

Convolution-based classification of audio and symbolic representations of music

Abstract: We present a novel convolution-based method for classification of audio and symbolic representations of music, which we apply to classification of music by style. Pieces of music are first sampled to pitch-time representations (spectrograms or piano-rolls), and then convolved with a Gaussian filter, before being classified by a support vector machine or by k-nearest neighbours in an ensemble of classifiers. On the well-studied task of discriminating between string quartet movements by Haydn and Mozart we obtai…
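The pipeline sketched in the abstract (pitch-time representation → Gaussian convolution → SVM) can be illustrated with synthetic data. This is a minimal sketch, not the paper's implementation: the grid sizes, the smoothing width `sigma=1.5`, and the two-class toy dataset are all illustrative assumptions.

```python
# Hypothetical sketch of the abstract's pipeline:
# piano-roll (pitch-time matrix) -> Gaussian smoothing -> flatten -> linear SVM.
# All data is synthetic; sigma and grid sizes are assumptions, not the
# paper's actual settings.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_piano_roll(high_register, n_pitches=32, n_frames=64):
    """Synthetic binary piano roll with notes biased toward the
    high or low pitch register (a stand-in for two 'styles')."""
    roll = np.zeros((n_pitches, n_frames))
    lo, hi = (n_pitches // 2, n_pitches) if high_register else (0, n_pitches // 2)
    for t in range(0, n_frames, 4):
        roll[rng.integers(lo, hi), t:t + 4] = 1.0
    return roll

# Toy two-class dataset, alternating classes.
X_rolls = [make_piano_roll(i % 2 == 0) for i in range(40)]
y = np.array([i % 2 for i in range(40)])

# Gaussian smoothing blurs exact onsets/pitches, so nearby notes
# contribute to the same features before classification.
X = np.stack([gaussian_filter(r, sigma=1.5).ravel() for r in X_rolls])

clf = SVC(kernel="linear").fit(X[:30], y[:30])
accuracy = clf.score(X[30:], y[30:])
```

The Gaussian blur is the step that makes the representation tolerant to small shifts in pitch and time; the classifier itself is a standard linear SVM on the flattened smoothed image.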

Cited by 13 publications (7 citation statements). References 26 publications.
“…These sources were applied for genre recognition [18][19][20][21], mood and emotion recognition [22][23][24][25], artist identification [26], hit song prediction [27], and playlist prediction [28]. Audio and symbolic features were used for genre recognition [29,30]. Audio and images were employed for mood prediction [31] and genre recognition [12].…”
Section: Related Work
“…As discussed before, Velarde et al (2016) attain the highest accuracy prior to our study, but they use a computer vision approach that is difficult to interpret musically: they apply a Gaussian filter to images of piano roll scores, transform the resulting pixel data through linear discriminant analysis, and classify with a linear SVM. In follow-up work, Velarde, Cancino Chacón, Meredith, Weyde, and Grachten (2018) extend their approach to include image analysis of spectrograms, as well as classification with a k-nearest neighbour classifier; however, as before, their study differs in scope from our musicological investigation. Finally, Taminau et al (2010) deploy subgroup discovery, a descriptive rule learning technique that involves both predictive and descriptive induction.…”
Section: Accuracy Comparisons With Previous Studies
“…We conclude there are significant musical differences between Haydn and Mozart string quartets, enabling less than 15% LOO error and the selection of similar models across folds.

Model                        | Study                   | Evaluation | Accuracy
—                            | Herlands et al., 2014   | CV trials  | 0.80
3-grams model                | Hontanilla et al., 2013 | LOO        | 0.747
LDA + linear SVM             | Velarde et al., 2016    | LOO        | 0.804
KNN + SVM ensemble           | Velarde et al., 2018    | LOO        | 0.748
Subgroup discovery           | Taminau et al., 2010    | LOO        | 0.730
Bayesian logistic regression | ours                    | LOO        | 0.8526…”
Section: Estimated Probability Of Composer
“…The third approach is to feed the raw data into a neural network model and to learn meaningful feature representations directly from the data. While this approach is not new (e.g., [23]), most recent works on composer classification have adopted this approach by applying a convolutional neural network (CNN) to a piano roll-like representation of the data [24][25][26]. CNN models can be considered the current state-of-the-art in composer classification.…”
Section: Introduction
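The third approach quoted above, a CNN applied to a piano-roll-like input, can be sketched in a few lines. This is an assumed toy architecture for illustration only, not the model of any cited paper: the channel counts, kernel size, and input dimensions are all placeholders.

```python
# Minimal sketch (assumed architecture, not from any cited paper) of a
# CNN over a piano-roll input for two-composer classification.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # learn local pitch-time motifs
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),                    # pool over the whole roll
    nn.Flatten(),
    nn.Linear(8, 2),                            # logits for two composers
)

# A piano roll as a 4-D tensor: (batch, channel, pitches, time frames).
roll = torch.zeros(1, 1, 32, 64)
logits = model(roll)  # shape: (1, 2)
```

The point of the convolutional layer is that, unlike the flattened-vector SVM approach, the learned filters are shared across pitch and time, so the same motif detector applies anywhere in the roll.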