China's education modernization requirements call for exploring new teaching methods to improve teaching effectiveness. In response, this article investigates the use of augmented reality (AR) technology to construct a blended English teaching model for the university setting. The blended English teaching model is built on mobile terminals: modeling tools are used for split modeling, and the mobile terminals themselves handle management and network synchronization for the cloud classroom. For the gesture recognition model, 3D convolutional neural networks are used to separate and optimize the parameters. Finally, simulation experiments are designed to analyze the AR-based blended English teaching model and assess its accuracy. The simulation results show that the model can achieve simultaneous display of the teacher's explanation and that the improved gesture recognition algorithm attains a higher recognition rate, which can improve the effectiveness of blended English teaching.