In vocal music learning, incorrect vocalization methods and overuse of the voice cause many vocal problems and can lead to chronic inflammation, so that a student's progress stagnates or even declines. Finding a way to improve without damaging the voice is a long-standing problem. In-depth research and discussion of “pharyngeal singing” is therefore of great practical significance for vocal music teaching in normal universities. This paper studies a pharyngeal training method for vocal music teaching based on audio extraction. Recognition time differs across training methods. When the recognition amount is 3, the average recognition time is 0.010 seconds for the method based on data mining, 0.011 seconds for the method based on the Internet of Things, and 0.006 seconds for the method based on audio extraction. The audio extraction method is thus roughly 40 to 45 percent faster than the two traditional methods, because it performs segmented training according to the changing trend of the physical characteristics of the notes, which lets it extract the characteristics of pharyngeal training effectively and shortens the recognition time. Learning “pharyngeal singing” through audio extraction differs from general vocal music training: it has its own theory, concepts, laws, and sound image, and, in order to “liberate your voice,” it adopts large-capacity, large-scale training methods.
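The comparison of recognition times can be made concrete with a short calculation. The sketch below uses only the three average times reported above for a recognition amount of 3; the relative-reduction metric and the names `times` and `relative_reduction` are illustrative assumptions and are not taken from the paper's own evaluation procedure.

```python
# Average recognition times in seconds at a recognition amount of 3,
# as reported in the abstract.
times = {
    "data mining": 0.010,
    "Internet of Things": 0.011,
    "audio extraction": 0.006,
}


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fraction by which `candidate` is shorter than `baseline` (hypothetical helper)."""
    return (baseline - candidate) / baseline


# Compare the audio extraction method against each of the other two methods.
reductions = {
    name: relative_reduction(t, times["audio extraction"])
    for name, t in times.items()
    if name != "audio extraction"
}

for name, r in reductions.items():
    print(f"audio extraction vs {name}: {r:.1%} shorter")
```

Under these figures the audio extraction method comes out about 40% shorter than the data mining method and about 45% shorter than the Internet of Things method. These ratios describe only the single reported operating point and may not hold for other recognition amounts.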