Eye-hand span, i.e., the distance between the note a performer is fixating and the note being executed, has been regarded as a decisive indicator of performers' competence in sight-reading. However, integrated perspectives on the relationship between eye-hand span and other sight-reading variables have received less attention. The present study explored the process of sight-reading in terms of three domains and their interrelations. The domain indicators were musical complexity and playing tempo (musical domain), eye-hand span (cognitive domain), and performance accuracy (behavioural domain). Thirty professional pianists sight-read four musical pieces at two levels of complexity and two playing tempi. We measured the participants' eye-hand span, evaluated their performance accuracy, and divided the participants into three groups according to their performance accuracy. Interestingly, we found that the eye-hand span did not vary with performance accuracy alone; rather, the relationship between eye-hand span and performance accuracy changed with the difficulty of the sight-reading task. Our results demonstrate that the eye-hand span is not a decisive indicator of sight-reading proficiency but a strategy that varies with task difficulty. Thus, proficient sight-readers are performers who are skilled at adjusting their eye-hand span rather than always maintaining an extended span.
Quantitative evaluation of piano performance is of interest in many fields, including music education and computational performance rendering. Previous studies utilized features extracted from audio or musical instrument digital interface (MIDI) files but did not address the difference between hands (DBH), which might be an important aspect of high-quality performance. Therefore, we investigated DBH as an important factor determining performance proficiency. To this end, 34 experts and 34 amateurs were recruited to play two excerpts on a Yamaha Disklavier. Each performance was recorded in MIDI, and handcrafted features were extracted separately for the right hand (RH) and left hand (LH). These were conventional MIDI features representing the temporal and dynamic attributes of each note, computed either as absolute values (e.g., MIDI velocity) or as ratios between the performance and the corresponding score (e.g., the ratio of duration or inter-onset interval (IOI)). These note-based features were rearranged into additional DBH features by simple subtraction between the features of the two hands. Statistical analyses showed that DBH was more pronounced in experts than in amateurs across features. Regarding temporal features, experts pressed keys longer and faster with the RH than did amateurs. Regarding dynamic features, the RH exhibited both greater values and smoother changes along melodic intonations in experts than in amateurs. Further experiments using principal component analysis (PCA) and a support vector machine (SVM) verified that hand-difference features can successfully differentiate experts from amateurs according to performance proficiency. Moreover, note-based raw feature values (Basic features) and DBH features were evaluated via 10-fold cross-validation; adding DBH features to the Basic features improved the F1 score to 93.6%, a gain of 3.5% over the Basic features alone. Our results suggest that controlling both hands differently and simultaneously is an important skill for pianists; therefore, DBH features should be considered in the quantitative evaluation of piano performance.
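A minimal sketch of the feature construction described above, assuming per-performance aggregates of the note-based features: DBH features are formed by subtracting LH features from RH features, and an SVM is evaluated with 10-fold cross-validation on Basic versus Basic + DBH feature sets. The feature names, synthetic data, and pipeline details are illustrative assumptions, not the authors' exact implementation.

```python
# Hand-difference (DBH) feature sketch with SVM and 10-fold cross-validation.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical per-performance features: rows = performances, columns =
# [mean velocity, mean duration ratio, mean IOI ratio] for each hand.
# Synthetic placeholder data; real features would come from MIDI recordings.
rh = rng.normal(size=(68, 3))           # right-hand (RH) features
lh = rng.normal(size=(68, 3))           # left-hand (LH) features
labels = np.array([1] * 34 + [0] * 34)  # 1 = expert, 0 = amateur (placeholder)

basic = np.hstack([rh, lh])             # note-based "Basic" features
dbh = rh - lh                           # DBH features by simple subtraction
combined = np.hstack([basic, dbh])      # Basic + DBH

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
for name, X in [("Basic", basic), ("Basic + DBH", combined)]:
    scores = cross_val_score(clf, X, labels, cv=10, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```

With real per-hand features in place of the random placeholders, the comparison of the two cross-validated F1 scores mirrors the Basic-versus-Basic + DBH evaluation reported in the abstract.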
Recent deep learning approaches for melody harmonization have achieved remarkable performance by overcoming the uneven chord distributions of music data. However, most of these approaches have not attempted to capture the original melodic structure and generate structured chord sequences with appropriate rhythms. Hence, we use a Transformer-based architecture that directly maps lower-level melody notes into a higher-level, semantic chord sequence. In particular, we encode the binary piano roll of a melody into a note-based representation. Furthermore, we address the flexible generation of diverse chords by extending the Transformer with a VAE framework. We propose three Transformer-based melody harmonization models: 1) a standard Transformer-based model for the neural translation of a melody to chords (STHarm), 2) a variational Transformer model for learning a global representation of complete music (VTHarm), and 3) a regularized variational Transformer model for the controllable generation of chords (rVTHarm). Experimental results demonstrate that the proposed models generate more structured and diverse chord sequences than LSTM-based models.
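A minimal sketch of the melody-to-chord mapping idea behind the STHarm-style standard Transformer, assuming a simple token vocabulary for the note-based melody input and the chord output; positional encodings and the VAE latent used by VTHarm/rVTHarm are omitted, and all vocabulary sizes and dimensions are illustrative assumptions rather than the authors' architecture.

```python
# Sketch: a standard Transformer mapping melody note tokens to chord tokens.
import torch
import torch.nn as nn

class MelodyToChordTransformer(nn.Module):
    def __init__(self, n_melody_tokens=130, n_chord_tokens=96, d_model=256):
        super().__init__()
        self.melody_emb = nn.Embedding(n_melody_tokens, d_model)
        self.chord_emb = nn.Embedding(n_chord_tokens, d_model)
        # Positional encodings omitted for brevity.
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=3, num_decoder_layers=3,
            batch_first=True,
        )
        self.out = nn.Linear(d_model, n_chord_tokens)

    def forward(self, melody, chords):
        # melody: (batch, melody_len) note tokens; chords: (batch, chord_len)
        # chord tokens used for teacher forcing during training.
        tgt_mask = self.transformer.generate_square_subsequent_mask(chords.size(1))
        h = self.transformer(
            self.melody_emb(melody), self.chord_emb(chords), tgt_mask=tgt_mask
        )
        return self.out(h)  # (batch, chord_len, n_chord_tokens) chord logits

# Toy usage with random token sequences.
melody = torch.randint(0, 130, (2, 64))
chords = torch.randint(0, 96, (2, 16))
logits = MelodyToChordTransformer()(melody, chords)
print(logits.shape)  # torch.Size([2, 16, 96])
```

The variational variants described in the abstract would additionally encode the full piece into a latent vector (with a KL term, and a regularization term in rVTHarm) that conditions the chord decoder; that extension is not shown here.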