Detecting fingering of overblown flute sound using sparse feature learning

Han, Yoonchang; Lee, Kyogu

doi:10.1186/s13636-015-0079-0

Cited by 5 publications

(2 citation statements)

References 14 publications


“…A mel-scale is based on the human auditory system and is approximately logarithmic above 1 kHz [23]. We used 128 mel-frequency bins following representation learning researches on music annotation [24], [25], musical instrument identification task [26], and fingering detection of overblown flute sound [27]; this is a reasonable size that sufficiently retain the original spectral characteristics, while significantly reducing the dimensionality of the data.…”

Section: A. Audio Preprocessing (mentioning)

confidence: 99%

Acoustic scene classification using convolutional neural network and multiple-width frequency-delta data augmentation

Han, Lee (2016). Preprint. Self-citation.


In recent years, neural network approaches have shown superior performance to conventional hand-crafted features in numerous application areas. In particular, convolutional neural networks (ConvNets) exploit spatially local correlations across input data to improve the performance of audio processing tasks such as speech recognition, musical chord recognition, and onset detection. Here we apply a ConvNet to acoustic scene classification and show that the error rate can be further decreased by using delta features in the frequency domain. We propose a multiple-width frequency-delta (MWFD) data augmentation method that uses static mel-spectrogram and frequency-delta features as individual input examples. In addition, we describe a ConvNet output aggregation method designed for MWFD augmentation, folded mean aggregation, which first multiplies the output probabilities of the static and MWFD features from the same analysis window, rather than taking an average of all output probabilities. We describe results on the DCASE 2016 challenge dataset, which show that the ConvNet outperforms both the baseline system with hand-crafted features and a deep neural network approach by around 7%. The performance was further improved (by 5.7%) using MWFD augmentation together with folded mean aggregation. The system exhibited a classification accuracy of 0.831 when classifying 15 acoustic scenes.
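The two ideas named in this abstract, frequency-delta features and folded mean aggregation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the delta formula, the widths, or any normalization, so the regression-style delta, the edge padding, the per-window renormalization, and the function names `frequency_delta`, `folded_mean_aggregation`, and `plain_mean_aggregation` are all assumptions made for illustration.

```python
import numpy as np


def frequency_delta(mel_spec, width):
    """Delta along the frequency axis of a (n_mels, n_frames) array.

    Assumption: the usual regression-style delta formula, applied across
    mel bins instead of across time, with an odd window `width` >= 3.
    """
    mel_spec = np.asarray(mel_spec, dtype=float)
    n = (width - 1) // 2
    n_mels = mel_spec.shape[0]
    # Repeat the edge bins so the output keeps the input shape (assumption).
    padded = np.pad(mel_spec, ((n, n), (0, 0)), mode="edge")
    delta = np.zeros_like(mel_spec)
    for k in range(1, n + 1):
        upper = padded[n + k : n + k + n_mels]  # bin i + k
        lower = padded[n - k : n - k + n_mels]  # bin i - k
        delta += k * (upper - lower)
    return delta / (2 * sum(k * k for k in range(1, n + 1)))


def folded_mean_aggregation(probs):
    """probs has shape (n_windows, n_feature_types, n_classes).

    Multiply the class probabilities of all feature types that belong to the
    same analysis window, then average the folded result over windows.
    """
    probs = np.asarray(probs, dtype=float)
    folded = np.prod(probs, axis=1)
    # Renormalizing each window is an assumption, not stated in the abstract.
    folded = folded / folded.sum(axis=1, keepdims=True)
    return folded.mean(axis=0)


def plain_mean_aggregation(probs):
    """Baseline for comparison: average every output probability vector."""
    probs = np.asarray(probs, dtype=float)
    return probs.reshape(-1, probs.shape[-1]).mean(axis=0)


# Two windows, two feature types (static and one delta width), two classes.
example_probs = [
    [[0.6, 0.4], [0.7, 0.3]],
    [[0.5, 0.5], [0.5, 0.5]],
]
folded_result = folded_mean_aggregation(example_probs)
plain_result = plain_mean_aggregation(example_probs)
```

In this toy input, multiplying within a window rewards the class on which both feature types agree, so the folded score for the first class comes out higher than the plain average.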


“…In their study, Almeida, Chow, Smith and Wolfe (2009: 1524) state that the fingerings used to produce notes on the flute do not differ between the first and second octaves except for C/C sharp, D, and D sharp/E flat, but that some flute students, particularly at the beginner stage, try to produce C, C sharp, D, and D sharp with first-octave fingerings as a result of acquiring incorrect habits and practicing carelessly. Likewise, according to Han and Lee (2016: 2), beginner flute students confuse flute fingerings with one another, even if unintentionally, and drift away from producing sound according to the rules. These behaviors, which can easily be noticed with good musical hearing and attention, can unfortunately lead to the acquisition of incorrect fingerings, especially at the beginner stage, and the effort of correcting those fingerings can in turn erode positive attitudes toward the instrument.…”

Section: Introduction (unclassified)

Flute Instructors’ Views About the Fingering Positions of Flute Students at Fine Arts High Schools

Ataman (2017). Idil.


TrumpetNet: A Convolutional Neural Network with Self-Attention Mechanisms for visual detection of trumpet fingering

Valdez-Rodríguez, Rangel, Moreno-Armendáriz (2024). IFS.


Visual detection of fingering on the trumpet is an increasingly interesting topic in music research. The ability to recognize and track the movements of the trumpet player’s fingers during the performance of a musical piece can provide valuable information for analyzing and improving instrument technique. However, this is a largely unexplored task, as most works focus on audio quality rather than instrument fingering techniques. Developing techniques for identifying essential finger positions on a musical instrument is crucial, as poor fingering techniques can harm instrument performance. In this work, we propose the visual detection of this fingering using convolutional neural networks with a proprietary dataset created for this purpose. Additionally, to improve the results and focus on the essential parts of the instrument, we use self-attention mechanisms by extracting these features automatically.
