With the development of artificial intelligence, machine learning algorithms and deep learning algorithms are widely applied to attack detection models. Adversarial attacks against artificial intelligence models become inevitable problems when there is a lack of research on the cross-site scripting (XSS) attack detection model for defense against attacks. It is extremely important to design a method that can effectively improve the detection model against attack. In this paper, we present a method based on reinforcement learning (called RLXSS), which aims to optimize the XSS detection model to defend against adversarial attacks. First, the adversarial samples of the detection model are mined by the adversarial attack model based on reinforcement learning. Secondly, the detection model and the adversarial model are alternately trained. After each round, the newly-excavated adversarial samples are marked as a malicious sample and are used to retrain the detection model. Experimental results show that the proposed RLXSS model can successfully mine adversarial samples that escape black-box and white-box detection and retain aggressive features. What is more, by alternately training the detection model and the confrontation attack model, the escape rate of the detection model is continuously reduced, which indicates that the model can improve the ability of the detection model to defend against attacks.