ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp43922.2022.9746549
A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation

Cited by 20 publications (6 citation statements). References 16 publications.
“…The proposed technique outperforms existing solutions in terms of low/high-frequency separation accuracy. Bittner et al (2022) introduced a lightweight neural network for instrument transcription that supports multiple vocal outputs, which can be extended to many instruments (including the human voice) in this study. The multi-output structure of our model increases frame-level note accuracy by simultaneously predicting frame-level onsets, multiple pitches, and note activations.…”
Section: Pitch (mentioning; confidence: 99%)
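The multi-output design referenced in this statement corresponds to the three posteriorgrams exposed by the released basic-pitch package. Below is a minimal sketch of inspecting them, assuming the published `basic_pitch` Python API and a placeholder audio file `example.wav`; the output dictionary keys are taken from the released package and may differ across versions.

```python
# Sketch: inspect the three outputs of the Basic Pitch model
# (frame-level onsets, multipitch contour, note activations).
from basic_pitch.inference import predict

# predict() returns the raw model posteriorgrams, a pretty_midi object,
# and a list of note events. "example.wav" is a placeholder path.
model_output, midi_data, note_events = predict("example.wav")

# The model predicts all three quantities in a single forward pass;
# the dictionary keys below are assumed from the released package.
onsets = model_output["onset"]      # frame-level onset posteriorgram
contour = model_output["contour"]   # multipitch (pitch contour) posteriorgram
notes = model_output["note"]        # note activation posteriorgram
print(onsets.shape, contour.shape, notes.shape)
```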
“…Coorevits et al (2019) again returned to the musical performance itself, examining the many effects that changes in musical tempo have on the 'performance state' or the articulation of the performer's movements. Bittner et al (2022) offered a way to understand musical tempo beyond listening to music by developing a Visual project and found that the number of notes per time unit and tempo also mattered. Pressnitzer et al (2000) concluded from a psychoacoustic perspective that psychoacoustic roughness increases when non-tonal orchestral timbres reduce musical tension.…”
Section: Tempo (mentioning; confidence: 99%)
“…Indeed, all pieces included in the analyses included multiple pitch lines. For polyphonic pitch estimation, we used pretrained models based on neural networks implemented in the python library basicpitch [26]. For the model input, we calculated the mean of all estimated pitches at each time point in order to receive one aggregated pitch score.…”
Section: PLOS ONE (mentioning; confidence: 99%)
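One way to reproduce the aggregation described above (the mean of all estimated pitches at each time point) is sketched below. It assumes basic_pitch's note posteriorgram output, a fixed activation threshold, and an 88-key pitch-bin layout starting at MIDI note 21; these are illustrative assumptions, not the exact procedure used in the cited study.

```python
import numpy as np
from basic_pitch.inference import predict

# Sketch: one aggregated pitch score per time point, as described above.
# Threshold and MIDI-bin mapping are assumptions for illustration.
model_output, _midi, _events = predict("example.wav")    # placeholder path
note_posterior = model_output["note"]                    # (n_frames, n_pitch_bins)

threshold = 0.3                                          # assumed activation threshold
midi_bins = np.arange(21, 21 + note_posterior.shape[1])  # assumed: bins start at A0 (MIDI 21)

mean_pitch = np.full(note_posterior.shape[0], np.nan)
for t, frame in enumerate(note_posterior):
    active = midi_bins[frame > threshold]                # pitches judged active at frame t
    if active.size:
        mean_pitch[t] = active.mean()                    # aggregated pitch score
```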
“…The model is implemented in the python library midi-miner [27]. The midi input required for the estimation of tonal tension was created using the python package basic-pitch [26] that includes a state-of-the-art polyphonic pitch estimation method. According to the model by [11], tonal tension can be quantified using three metrics.…”
Section: PLOS ONE (mentioning; confidence: 99%)
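The two-step pipeline described here (audio to MIDI with basic-pitch, then tonal tension with midi-miner) can be sketched as follows for the first step. The midi-miner interface itself is not reproduced, since its entry point varies by version; the file paths are placeholders.

```python
from basic_pitch.inference import predict

# Step 1 (sketch): transcribe audio to a MIDI file with basic-pitch.
# "example.wav" and "example.mid" are placeholder paths.
_model_output, midi_data, _note_events = predict("example.wav")
midi_data.write("example.mid")   # midi_data is a pretty_midi.PrettyMIDI object

# Step 2: pass "example.mid" to midi-miner's tonal-tension calculation
# (e.g. its tension calculation script) to obtain the three tension metrics.
```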
“…In the monophonic BWE system, the DDSP model generates a harmonic signal based on a single f_0 estimated from a monophonic pitch estimator [43]. Now that we are in a polyphonic context, we use a state-of-the-art multipitch estimator [45] which outputs a maximum of I different fundamental frequencies f_0^i. Considering that this multi-pitch estimator has a rather good performance, we propose to iteratively use the monophonic DDSP model DDSP-mono-dec in a cyclic manner, as illustrated in Fig.…”
Section: Cyclic Monophonic Decoder (mentioning; confidence: 99%)
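The cyclic scheme described in this statement amounts to a loop over the I fundamental-frequency tracks returned by the multipitch estimator, reusing one monophonic DDSP decoder per track and summing the resulting harmonic signals. The sketch below is illustrative only; the function name and signature are assumptions, not the authors' implementation.

```python
def cyclic_monophonic_decode(f0_tracks, loudness, ddsp_mono_dec):
    """Sketch of the cyclic scheme: each of the I fundamental-frequency
    tracks f_0^i from the multipitch estimator is fed in turn to the same
    monophonic DDSP decoder, and the per-voice harmonic signals are summed.
    All names here are illustrative assumptions."""
    output = None
    for f0_i in f0_tracks:                           # iterate over the I pitch tracks
        harmonic_i = ddsp_mono_dec(f0_i, loudness)   # one monophonic decoding pass
        output = harmonic_i if output is None else output + harmonic_i
    return output
```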