Improving the accuracy of deep learning-based damage identification methods is a long-standing goal. To this end, this study proposes a novel damage identification method that combines the Swin Transformer with the continuous wavelet transform (CWT). First, the raw structural vibration data are transformed into a time-frequency diagram by the CWT, which captures the characteristic information of structural damage. Second, the Swin Transformer learns the two-dimensional time-frequency diagram layer by layer and extracts the damage information, from which the damage is identified. Third, the identification accuracy of the proposed method is analyzed under various sample lengths and different levels of environmental noise to validate its robustness. Finally, the practicality of the method is verified through a laboratory test. The results show that the proposed method recognizes damage effectively and maintains high accuracy even under noise interference, reaching 99.6% and 99.0% in the single-damage and multiple-damage scenarios, respectively.
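
The first step of the pipeline, turning a one-dimensional vibration record into a two-dimensional time-frequency diagram, can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the complex Morlet mother wavelet, its centre frequency `w0 = 6`, the truncation of the wavelet at four scale widths, and the function names `morlet` and `cwt_scalogram`. The sketch evaluates the standard CWT definition, W(s, b) = s^(-1/2) ∫ x(t) ψ*((t − b)/s) dt, by direct summation and returns its magnitude.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet; w0 is an assumed centre frequency (rad)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_scalogram(signal, fs, freqs, w0=6.0):
    """Return |CWT| as a list of rows, one row per analysis frequency (Hz)."""
    dt = 1.0 / fs
    n = len(signal)
    rows = []
    for f in freqs:
        # Scale at which the Morlet wavelet is centred on frequency f.
        scale = w0 / (2.0 * math.pi * f)
        # Assumption: the wavelet is negligible beyond 4 scale widths.
        half = int(math.ceil(4.0 * scale / dt))
        # Sampled, conjugated, scale-normalised wavelet on [-half, half].
        kernel = [
            morlet(k * dt / scale, w0).conjugate() * dt / math.sqrt(scale)
            for k in range(-half, half + 1)
        ]
        row = []
        for i in range(n):
            acc = 0j
            for j, kv in enumerate(kernel):
                idx = i + j - half  # sample aligned with kernel tap j
                if 0 <= idx < n:    # samples outside the record count as zero
                    acc += signal[idx] * kv
            row.append(abs(acc))
        rows.append(row)
    return rows


# Illustrative input: a 2 s, 10 Hz sine sampled at 200 Hz (not data from the study).
fs = 200.0
signal = [math.sin(2.0 * math.pi * 10.0 * k / fs) for k in range(400)]
freqs = [5.0, 10.0, 20.0, 40.0]
scalogram = cwt_scalogram(signal, fs, freqs)
```

Each row of `scalogram` is the wavelet magnitude over time at one analysis frequency, so the nested list can be rendered as an image and passed to an image classifier such as a Swin Transformer. For real records a vectorised library routine (for example the CWT in PyWavelets) would be the more practical choice; the direct summation above is only meant to make the transform explicit.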