Timbre and volume are among the fundamental characteristics of a single tone. Single-tone detection is the foundation of musical note feature recognition (MNFR), which builds on the extraction of fundamental features from single tones. Because traditional methods achieve low accuracy in both note feature classification and MNFR, an MNFR method based on long short-term memory (LSTM) is proposed. To process the series of convolution feature maps, the feature maps are input directly into the LSTM to learn hash codes. Note features are extracted, and the notes are segmented according to the changing trend of their physical features. Additionally, a number of feature maps are built from convolution feature maps extracted from multiple convolution layers of pretrained convolutional neural networks (CNNs), taking into account both spatial details and semantic features. The note start vector is produced using an enhanced peak extraction algorithm based on Gaussian kernel smoothing. The findings indicate that, when 100 samples are used, the note classification accuracy of this method differs from that of the deep belief network (DBN) and discrete wavelet transform (DWT) methods by 1.17 percent and 2.04 percent, respectively. The analysis in the conclusion demonstrates that the algorithm proposed in this paper is both theoretically and practically workable.
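
As an illustration of the peak extraction step mentioned above, the following is a minimal sketch of how a note start vector might be obtained from a detection envelope by Gaussian kernel smoothing followed by local-maximum picking. It is not the enhanced algorithm of this paper, whose details are not given here. The function names (`gaussian_kernel`, `pick_onsets`), the kernel width `sigma`, and the relative `threshold` are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalized, truncated Gaussian kernel (radius of about 3 sigma)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def pick_onsets(envelope, sigma=2.0, threshold=0.3):
    """Hypothetical helper: binary note start vector from a 1-D envelope.

    The envelope is smoothed with a Gaussian kernel, then every local
    maximum whose height is at least `threshold` times the global maximum
    of the smoothed curve is marked with 1. The envelope is assumed to be
    longer than the kernel.
    """
    env = np.asarray(envelope, dtype=float)
    smoothed = np.convolve(env, gaussian_kernel(sigma), mode="same")

    # Interior samples that rise from the left and do not rise to the right.
    mid = smoothed[1:-1]
    is_peak = (mid > smoothed[:-2]) & (mid >= smoothed[2:])
    is_peak &= mid >= threshold * smoothed.max()

    onsets = np.zeros(len(smoothed), dtype=int)
    onsets[1:-1] = is_peak
    return onsets
```

Smoothing before peak picking suppresses small spurious maxima in the envelope, so that only the dominant peaks are marked as note starts. The relative threshold is one simple way to discard low-energy peaks, and other selection rules are equally possible.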