Cross-site scripting (XSS) is a major problem in network security, and many XSS detection models have been studied. However, the emergence of adversarial XSS samples has reduced the detection accuracy of these models. This paper therefore proposes a reinforcement learning-based XSS adversarial attack model that generates adversarial XSS samples. The model consists of a detection module and an adversarial attack module. In the detection stage, the detection module cleans the original XSS script data, vectorizes it with Word2Vec, and feeds the processed data into its XSS detection model. This detection model is built by ensemble learning, combining LSTM, MLP, and SVM classifiers to obtain a more accurate detector. The detection module then outputs the classification result for the original XSS data in preparation for the subsequent escape stage. In the escape stage, the adversarial attack module, which is designed around the reinforcement learning algorithm TD3, uses an adversarial generation module to produce valid adversarial samples that bypass the XSS detection model. Experimental results show that adversarial XSS samples generated with TD3 achieve an evasion rate nearly 6% higher than those generated with Soft-Q learning. The model offers a new approach to improving the accuracy of XSS detection models and to supplying more valuable XSS attack samples.
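
The two quantities the abstract relies on, the ensemble decision and the evasion rate, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how the LSTM, MLP, and SVM outputs are combined (hard majority voting is assumed here), it does not define the evasion rate (taken here as the fraction of adversarial samples the ensemble labels benign), and the three `toy_*` functions are hypothetical keyword rules standing in for trained classifiers.

```python
from typing import Callable, Sequence

# A detector maps a script string to a label: 1 = malicious (XSS), 0 = benign.
Detector = Callable[[str], int]


def majority_vote(detectors: Sequence[Detector], sample: str) -> int:
    """Hard-voting ensemble (assumed): malicious if more than half of the detectors say so."""
    votes = sum(d(sample) for d in detectors)
    return 1 if votes * 2 > len(detectors) else 0


def evasion_rate(detectors: Sequence[Detector], adversarial_samples: Sequence[str]) -> float:
    """Fraction of adversarial samples that the ensemble labels benign (assumed definition)."""
    if not adversarial_samples:
        return 0.0
    evaded = sum(1 for s in adversarial_samples if majority_vote(detectors, s) == 0)
    return evaded / len(adversarial_samples)


# Hypothetical stand-ins for the trained LSTM, MLP, and SVM classifiers.
def toy_lstm(s: str) -> int:
    return int("<script" in s.lower())


def toy_mlp(s: str) -> int:
    return int("alert(" in s.lower())


def toy_svm(s: str) -> int:
    return int("onerror" in s.lower() or "<script" in s.lower())


DETECTORS = [toy_lstm, toy_mlp, toy_svm]

# Illustrative payloads only; these are not samples from the paper's dataset.
SAMPLES = [
    "<script>alert(1)</script>",
    "<img src=x onerror=confirm(1)>",
    "<ScRiPt>prompt(1)</sCrIpT>",
    "<svg/onload=alert(1)>",
]

if __name__ == "__main__":
    for s in SAMPLES:
        print(majority_vote(DETECTORS, s), s)
    print("evasion rate:", evasion_rate(DETECTORS, SAMPLES))
```

In a reinforcement learning setting such as the one described, a signal of this kind (whether a mutated sample flips the ensemble label to benign) is a natural candidate for the reward, although the abstract does not specify the reward function actually used.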