Although fast adversarial training provides an efficient approach for building robust networks, it may suffer from a serious problem known as catastrophic overfitting (CO), where the multi-step robust accuracy suddenly collapses to zero. In this paper, we decouple, for the first time, single-step FGSM (fast gradient sign method) adversarial examples into data-information and self-information, which reveals an interesting phenomenon called "self-fitting". Self-fitting, i.e., the DNN learning the self-information embedded in its own single-step perturbations, naturally leads to the occurrence of CO. When self-fitting occurs, the network exhibits a pronounced "channel differentiation" phenomenon: the convolution channels responsible for recognizing self-information become dominant, while those for data-information are suppressed. As a result, the network learns to recognize only images containing sufficient self-information and loses its ability to generalize to other types of data. Based on self-fitting, we provide new insight into existing methods for mitigating CO and extend the notion of CO to multi-step adversarial training. Our findings reveal a self-learning mechanism in adversarial training and open up new perspectives for suppressing different kinds of information to mitigate CO.

Impact Statement: Fast adversarial training is an effective and efficient adversarial training method. However, it is prone to instability and can lead to catastrophic overfitting (CO). In this paper, we reveal for the first time the existence of model self-information in adversarial examples and argue that fitting this self-information (self-fitting) is one of the factors that contribute to CO. Our findings can further aid the understanding of CO in fast adversarial training, and even in multi-step adversarial training, inspiring the design of more stable and efficient adversarial training algorithms. The discovery of self-fitting is relevant not only to adversarial attacks but also to other methods that involve network information, such as curriculum learning, active learning, and self-supervised learning.
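To make the setting concrete, the following is a minimal, hypothetical PyTorch sketch of one step of single-step (FGSM) fast adversarial training, the regime in which CO arises. The function name and hyperparameters (e.g., eps) are illustrative assumptions; the sketch shows only that the perturbation is generated by the very model being trained, and does not reproduce the paper's decoupling of data-information and self-information.

```python
import torch
import torch.nn.functional as F

def fgsm_training_step(model, optimizer, x, y, eps=8 / 255):
    """One step of single-step (FGSM) adversarial training.

    Because the perturbation is generated by the model being trained,
    it carries model-dependent ("self") information in addition to
    data-dependent information.
    """
    # Enable gradients with respect to the input for the attack step.
    x_adv = x.clone().detach().requires_grad_(True)

    # Single-step FGSM: move along the sign of the input gradient.
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

    # Update the model on its own single-step adversarial examples.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```

In this loop the single-step perturbation is a function of the current model's gradients, which is why the resulting examples can embed self-information that the network may then overfit to.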