To improve English learning outcomes in the context of smart education, this study combines speech coding with an improved intelligent speech recognition algorithm and builds an intelligent English learning system. Drawing on the characteristics of the human ear, it develops a coding strategy based on a psychoacoustic masking model. The study analyzes in detail the basic principles and implementation process of this coding strategy, in which channel selection is completed by calculating the masking threshold, and verifies the effectiveness of the algorithm through simulation experiments. Finally, it builds a smart speech recognition system based on this model and uses simulation experiments to verify the effect of smart speech recognition on English learning. To improve recognition performance, band-pass filtering and envelope detection adopt the gammatone filter bank and the Meddis inner hair cell model from the mathematical model of the cochlear system; at the same time, the psychoacoustic masking-effect model is introduced in the channel selection stage. As a result, both the noise robustness and the recognition performance of the smart speech system are improved. The analysis shows that the proposed intelligent speech recognition system can effectively improve English learning, and that the benefit is most pronounced for oral (spoken English) learning.
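The processing chain summarized above (gammatone band-pass filtering, envelope extraction, then masking-based channel selection) can be illustrated with a minimal sketch. It is not the study's implementation and rests on several assumptions: the Meddis inner hair cell model is replaced here by plain half-wave rectification, the masking threshold is approximated by a triangular spread over channel indices, and the names `select_channels`, `slope_db`, `offset_db`, and `floor_db`, together with their default values, are hypothetical choices made only for illustration. The gammatone impulse response and the ERB formula follow the standard textbook definitions.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz at centre frequency fc (Glasberg-Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4, b=1.019):
    """Gammatone impulse response, scaled so the gain at fc is approximately 1."""
    t = np.arange(int(duration * fs)) / fs
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * erb(fc) * t) * np.cos(2 * np.pi * fc * t)
    gain_at_fc = np.abs(np.sum(g * np.exp(-2j * np.pi * fc * t)))
    return g / gain_at_fc

def channel_levels_db(signal, fs, centre_freqs):
    """Per-channel level in dB after gammatone filtering and half-wave rectification.

    Assumption: half-wave rectification stands in for the Meddis inner hair cell model.
    """
    levels = []
    for fc in centre_freqs:
        y = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        env = np.maximum(y, 0.0)  # crude envelope detector
        levels.append(10 * np.log10(np.mean(env ** 2) + 1e-12))
    return np.array(levels)

def select_channels(levels_db, n_select, slope_db=12.0, offset_db=6.0, floor_db=-120.0):
    """Greedy channel selection against a running masking threshold.

    At each step, pick the unselected channel whose level exceeds the current
    threshold by the largest margin, then raise the threshold around it.
    The triangular spread (slope_db per channel, offset_db below the masker)
    is a hypothetical simplification of a psychoacoustic spreading function.
    """
    levels_db = np.asarray(levels_db, dtype=float)
    n = len(levels_db)
    idx = np.arange(n)
    mask = np.full(n, floor_db)        # flat stand-in for the threshold in quiet
    chosen = np.zeros(n, dtype=bool)
    selected = []
    for _ in range(min(n_select, n)):
        excess = levels_db - mask
        excess[chosen] = -np.inf       # never pick the same channel twice
        k = int(np.argmax(excess))
        selected.append(k)
        chosen[k] = True
        spread = levels_db[k] - offset_db - slope_db * np.abs(idx - k)
        mask = np.maximum(mask, spread)
    return selected

# Usage: channel 1 is nearly as loud as channel 0 but is largely masked by it,
# so the second pick goes to the more distant channel 3.
print(select_channels([60.0, 58.0, 20.0, 55.0, 10.0], n_select=2))  # [0, 3]
```

With plain maximum-energy selection the two loudest channels (0 and 1) would be chosen; the masking-aware variant skips the neighbouring channel because most of its energy lies under the threshold raised by the first pick, which is the intuition behind using a masking threshold at the channel selection stage.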