The multimedia-assisted teaching model is now part of English teaching (ET) practice in colleges and universities and has strongly influenced it. In this paper, we conduct an empirical study of multimedia-assisted English teaching. First, we propose a neural machine translation model that combines multi-granularity features with dynamic word vectors to address inaccurate English translation. Second, we propose a grammatical error correction model based on a generative adversarial network to correct grammatical errors in English teaching. In comparison experiments against other neural machine translation models, the improved model achieves BLEU scores of 35.58 and 45.71, and its translation accuracy is 23.9% higher than that of the traditional model. In the teaching experiments, the multimedia-assisted teaching model improved student performance by 9.8%, whereas the traditional teaching model improved it by only 1.9%. The multimedia-assisted teaching model proposed in this paper has a positive effect on students’ performance, intercultural communication (IC) awareness, learning initiative, and interactivity, and it provides a useful reference for multimedia-assisted language teaching.