Early historical recordings often make a poor first impression on contemporary listeners: the overall tempo is relatively fast, the rhythm is relatively loose and free, and the string playing slides between notes. In recent years, a large body of research has emerged, at home and abroad, that applies computer visualization analysis methods to music performance practice. It is both feasible and necessary to gradually introduce visual analysis of audio parameters into music research and the teaching of performance practice. This study proposes a prosody hierarchy prediction model based on a CNN (convolutional neural network), with word vectors added as semantic features. DL (deep learning) is applied to vocal music recognition, and a recognition method built on a DL framework is proposed that incorporates traditional audio signal processing methods. After word vectors were introduced as features in the CNN model, the F-score increased from 77% to 80%. Experiments verify the feasibility of the proposed DL-based vocal music recognition algorithm.