Composer style classification of piano sheet music images using language model pretraining

Tsai, Timothy; Ji, Kevin

doi:10.5281/zenodo.4245398

2020

Cited by 3 publications

Instrument Classification of Solo Sheet Music Images

Yang, Tsai. 2021

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This paper studies instrument classification of solo sheet music. Whereas previous work has focused on instrument recognition in audio data, we instead approach the instrument classification problem using raw sheet music images. Our approach first converts the sheet music image into a sequence of musical words based on the bootleg score representation, and then treats the problem as a text classification task. We show that it is possible to significantly improve classifier performance by training a language model on unlabeled data, initializing a classifier with the pretrained language model weights, and then finetuning the classifier on labeled data. In this work, we train AWD-LSTM, GPT-2, and RoBERTa models on solo sheet music images from IMSLP for eight different instruments. We find that GPT-2 and RoBERTa slightly outperform AWD-LSTM, and that pretraining increases classification accuracy for RoBERTa from 34.5% to 42.9%. Furthermore, we propose two data augmentation methods that increase classification accuracy for RoBERTa by an additional 15%.
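As a rough illustration of the "sequence of musical words" step described in the abstract above, the sketch below packs each column of a small binary notehead matrix into a single integer token, so that a page becomes a token sequence a text classifier could consume. The function name `columns_to_words`, the bit ordering, and the toy 3-row matrix are assumptions made for illustration; the abstract does not specify how the bootleg score is encoded.

```python
from typing import List, Sequence


def columns_to_words(matrix: Sequence[Sequence[int]]) -> List[int]:
    """Pack each column of a binary matrix (rows x columns) into one integer.

    Hypothetical encoding: row r of a column sets bit r of the resulting word.
    """
    if not matrix:
        return []
    n_cols = len(matrix[0])
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all rows must have the same length")

    words = []
    for c in range(n_cols):
        word = 0
        for r, row in enumerate(matrix):
            if row[c]:
                word |= 1 << r  # assumed bit ordering: row index = bit index
        words.append(word)
    return words


# Toy matrix: 3 staff positions (rows) by 3 time steps (columns).
toy = [
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 1],
]
print(columns_to_words(toy))  # prints [5, 0, 6]
```

Each integer then plays the role of a word in a vocabulary, which is what lets language model pretraining on unlabeled pages be applied before finetuning on labeled ones.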

MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding

Chou, Chen, Chang et al. 2021

Preprint

This paper presents an attempt to employ the masked language modeling approach of BERT to pre-train a 12-layer Transformer model over 4,166 pieces of polyphonic piano MIDI files for tackling a number of symbolic-domain discriminative music understanding tasks. These include two note-level classification tasks, i.e., melody extraction and velocity prediction, as well as two sequence-level classification tasks, i.e., composer classification and emotion classification. We find that, given a pretrained Transformer, our models outperform recurrent neural network based baselines with fewer than 10 epochs of fine-tuning. Ablation studies show that the pre-training remains effective even if none of the MIDI data of the downstream tasks are seen at the pre-training stage, and that freezing the self-attention layers of the Transformer at the fine-tuning stage slightly degrades performance. All five datasets employed in this work are publicly available, as well as checkpoints of our pre-trained and fine-tuned models. As such, our research can be taken as a benchmark for symbolic-domain music understanding.
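To make the BERT-style masking objective mentioned in this abstract concrete, here is a minimal sketch of how a token sequence might be corrupted for pretraining. The 15% selection rate and the 80/10/10 split (mask, random token, unchanged) follow the commonly used BERT recipe and are assumed here; the abstract does not state the settings used for MidiBERT-Piano. `MASK_ID`, `mask_tokens`, and the integer token ids are hypothetical names for illustration.

```python
import random
from typing import List, Optional, Tuple

MASK_ID = 0  # hypothetical id reserved for the [MASK] token


def mask_tokens(
    tokens: List[int],
    vocab_size: int,
    mask_prob: float = 0.15,
    seed: Optional[int] = None,
) -> Tuple[List[int], List[Optional[int]]]:
    """Return (corrupted inputs, labels); labels are None where no loss applies."""
    rng = random.Random(seed)
    inputs = list(tokens)
    labels: List[Optional[int]] = [None] * len(tokens)

    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_ID  # replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(1, vocab_size)  # replace with a random token
        # otherwise the original token is left in place

    return inputs, labels


tokens = list(range(1, 21))
inputs, labels = mask_tokens(tokens, vocab_size=100, seed=0)
print(sum(label is not None for label in labels), "positions selected")
```

The loss is computed only at positions where `labels` is not `None`, so the model learns to reconstruct tokens from their context without needing any task labels.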

DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models

Sarmento, Kumar, CJ et al. 2021
