With economic globalization and the continued growth and internationalization of China's economy and society, the attention China pays to English has increased significantly. However, domestic English teaching resources are limited, so students' pronunciation cannot always be corrected and reasonably evaluated, which puts oral training at a disadvantage. Moreover, computer-aided language learning systems at home and abroad focus on vocabulary and grammar practice, and their evaluation indicators are few and not comprehensive. Because English pronunciation varies in complex ways, traditional speech recognition has difficulty handling speech rate and improving its accuracy. To strengthen the English pronunciation of domestic students, this paper studies in depth a nonlinear network structure that simulates how the human brain analyzes speech, and establishes a speech recognition model based on Mel-frequency cepstral coefficient (MFCC) feature parameters, which follow a model of the human ear, and a deep belief network (DBN). The traditional computer-based pronunciation evaluation method is improved comprehensively, and a high-quality speech recognition system is constructed. To address the above problems, students are taken as the research subjects. The results show that the proposed method can give learners an accurate pronunciation quality report and guidance, correct their intonation, and improve learning outcomes, and the experimental data verify that the improved speech recognition model achieves higher recognition ability than the traditional model.