This paper uses digital twin technology to construct a situational teaching mode for the English classroom. The system combines virtual reality and computer imaging technology with synchronized video and audio processing to provide a new set of methods for students' language learning. The graphics rendering server of the scene-interactive teaching system generates 3D virtual scenes or real photographs in real time. Based on the teaching situation in the English classroom, this paper constructs the functional modules of the situational teaching system, analyzes the system implementation methods in depth, presents the flow of the system's core algorithm in diagrams and tables, and derives the overall system framework. Finally, the effect of the proposed English classroom situational teaching model is evaluated through experimental research. The experimental results indicate that the proposed teaching model is effective.