Timbre is one of the basic elements of music and one of the most important attributes of sound; it is also the main basis for distinguishing one sound source from another. People can usually recognize a speaker by voice alone because every voice has a different timbre. However, existing audio feature extraction techniques suffer from low efficiency and low accuracy. This paper therefore investigates an algorithm that makes music timbre feature extraction more accurate and efficient. For audio signal feature extraction, it proposes an audio feature based on harmonic components that describes the harmonic structure information in the audio signal spectrum. The algorithm extracts timbre features from sound data of Western musical instruments and national musical instruments, and the recognition accuracy is analyzed. The experimental results show that the classification accuracy of all four feature extractors is above 92%: extractor B performs worst, with an accuracy of 92.42%, and extractor D performs best, with an accuracy of 99.15%. This indicates that combining the feature extraction algorithm designed in this paper with traditional feature extraction algorithms achieves better results.
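To make the idea of a harmonic-component feature concrete, the following is a minimal sketch of one possible descriptor: the relative magnitudes of the first few harmonics of a tone, read from its magnitude spectrum. It is an illustration only and is not the algorithm evaluated in this paper. The function name `harmonic_features`, the 3% search tolerance, and the assumption that the fundamental frequency `f0` is already known are all hypothetical choices made for the example.

```python
import numpy as np

def harmonic_features(signal, sample_rate, f0, n_harmonics=5, tolerance=0.03):
    """Relative magnitudes of the first n_harmonics partials of f0.

    Hypothetical helper: f0 is assumed known, and `tolerance` (a fraction
    of each harmonic frequency) is an arbitrary search width.
    """
    # A Hann window reduces spectral leakage between neighbouring partials.
    window = np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    mags = []
    for k in range(1, n_harmonics + 1):
        target = k * f0
        # Take the spectral peak in a narrow band around the k-th harmonic.
        band = (freqs >= target * (1 - tolerance)) & (freqs <= target * (1 + tolerance))
        mags.append(spectrum[band].max() if band.any() else 0.0)

    mags = np.array(mags)
    total = mags.sum()
    # Normalise so the feature describes spectral shape, not loudness.
    return mags / total if total > 0 else mags

# Synthetic tone: 220 Hz fundamental with harmonic amplitudes 1.0, 0.5, 0.25.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = sum(a * np.sin(2 * np.pi * 220 * k * t)
           for k, a in zip((1, 2, 3), (1.0, 0.5, 0.25)))
feats = harmonic_features(tone, sample_rate, f0=220.0, n_harmonics=3)
```

For the synthetic tone, `feats` recovers the 4:2:1 amplitude pattern of the three harmonics as proportions that sum to one. Two instruments playing the same pitch would generally yield different vectors of this kind, which is what makes such a harmonic-structure descriptor a candidate timbre feature.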