To investigate whether a network multi-modal learning environment can improve students' English audio-visual skills and the quality of teaching, this study presents an empirical analysis of audio-visual teaching for English majors based on network multi-modal learning environment theory. Ninety-eight students from two classes in the same school were selected as participants and divided into an experimental class and a control class. The experimental class was taught with a teaching model based on network multi-modal learning environment theory. Pre-test and post-test scores were analyzed with SPSS 19.0. After 3 months of experimental teaching, the mean post-test score of the experimental class was 13.65 points, 2.23 points higher than its pre-test mean of 11.42 points. The mean post-test score of the control class was 11.08 points, 0.10 points lower than its pre-test mean of 11.18 points. These results suggest that teaching based on a network multi-modal learning environment is more conducive to improving the audio-visual performance of English majors and the effectiveness of teaching.
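The score changes reported above can be checked with a few lines of arithmetic. The sketch below uses only the four group means stated in the abstract. The names `reported_means`, `mean_gain`, and `gain_gap` are illustrative and do not come from the study. It is a descriptive contrast only and does not reproduce the SPSS analysis, which would require the individual student scores.

```python
# Group means as reported in the abstract (points).
reported_means = {
    "experimental": {"pre": 11.42, "post": 13.65},
    "control": {"pre": 11.18, "post": 11.08},
}


def mean_gain(group):
    """Return post-test mean minus pre-test mean, rounded to two decimals."""
    return round(group["post"] - group["pre"], 2)


# Gain (or loss) in mean score for each class.
gains = {name: mean_gain(scores) for name, scores in reported_means.items()}

# Gap between the two gains: a descriptive contrast, not a significance test.
gain_gap = round(gains["experimental"] - gains["control"], 2)

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} points")
print(f"gap between gains: {gain_gap:+.2f} points")
```

Whether the gap between the two gains is statistically significant cannot be determined from the means alone; that would depend on the score variances and class sizes, which the abstract does not give.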