Research on applying natural language processing to computer-aided language systems is growing, and such systems can serve as useful assistants for foreign language learners. However, existing computer-aided language systems still recognize stress nodes in spoken English poorly. To address this, this paper proposes a spoken English accent recognition method based on natural language processing and an endpoint detection algorithm, with the aim of improving both the accuracy of accent recognition and the overall performance of computer-aided language systems. First, to limit interference from background noise, a short-term time-frequency endpoint detection algorithm is proposed that can accurately determine where speech begins and ends in complex environments. Second, building on traditional speech feature extraction and fractal dimension theory, a nonlinear fractal dimension speech feature is extracted. Finally, RankNet is applied to the extracted features to recognize stress nodes in spoken English. Simulation experiments verify the effectiveness of the proposed endpoint detection algorithm under complex background noise and the contribution of the nonlinear fractal dimension features to stress node recognition, and illustrate the performance and practical applicability of the proposed method.
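
To make the idea of endpoint detection concrete, the following is a minimal sketch of a generic energy-threshold detector. It is not the short-term time-frequency algorithm proposed in this paper, whose details are not given here; it only illustrates what "judging the beginning and end of speech" means at the frame level. The function names, the frame length, the hop size, and the `ratio` threshold are all assumptions chosen for illustration, and the test signal is synthetic.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def detect_endpoints(samples, frame_len=200, hop=100, ratio=0.1):
    """Return (start_sample, end_sample) of the active region, or None.

    Assumption: a frame counts as speech when its energy exceeds
    `ratio` times the largest frame energy in the recording. This is a
    simple stand-in, not the paper's time-frequency criterion.
    """
    frames = frame_signal(samples, frame_len, hop)
    if not frames:
        return None
    energies = [short_time_energy(f) for f in frames]
    peak = max(energies)
    if peak == 0:
        return None  # completely silent input
    threshold = ratio * peak
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    start = active[0] * hop
    end = active[-1] * hop + frame_len
    return start, end


def make_demo_signal(rate=8000):
    """Synthetic input: quiet hum, then a loud 440 Hz tone, then quiet hum."""
    quiet = [0.01 * math.sin(2 * math.pi * 50 * n / rate) for n in range(1000)]
    tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(2000)]
    return quiet + tone + quiet


if __name__ == "__main__":
    # The tone occupies samples 1000 to 3000 of the demo signal.
    print(detect_endpoints(make_demo_signal()))
```

On the synthetic signal the detected region brackets the tone to within one frame. A pure energy threshold of this kind degrades when the background noise is loud or non-stationary, which is the situation the proposed time-frequency approach is meant to handle.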