With advances in technology, intelligent and sustainable development has become part of daily life. However, developmental differences among regions make continuity in English language teaching difficult to maintain. In the context of sustainable development, the goal of teaching is to tailor learning plans to individual students through intelligent intervention. In this paper, we address two problems in online English teaching: classifying students’ interests, and jointly assessing the listening, reading, and writing modules. Our results show that an autoencoder can recognize students’ interests in the four modules with an accuracy of up to 93.1%. In addition, with gated recurrent units (GRUs), the mean squared error (MSE) between the comprehensive assessment and the teacher-assigned grade is only 0.63, substantially outperforming other RNN-type methods. The proposed framework can therefore help advance future research on intelligent, sustainable English teaching and on the problems of multi-module assessment and multi-information integration.