The safe operation of transmission lines underpins domestic economic development, so improving the ability to diagnose and identify faults in their electrical components is an urgent need. To identify electrical components and detect thermal defects in large volumes of aerial images, this study proposes a cascaded detection method based on infrared images and the YOLO model. First, two infrared datasets for classification and localization are created, totaling 4887 infrared images. Second, to improve the accuracy and robustness of electrical component identification, similarity-based attention modules, a cross-level weighted feature pyramid network, and Wise IoU are introduced into the original YOLOv7. Finally, the improved YOLOv7 and the comparison models are trained and tested on the infrared datasets. The mAP of the improved model reaches 97.4%, which is 6% higher than that of the original YOLOv7. More importantly, cascading the improved YOLOv7 with YOLOv7-tiny for thermal defect detection yields an AP of 87.91%, more than 20 percentage points higher than that of YOLOv7 alone (67.17%). The experimental results show that the cascaded model outperforms mainstream object detection models in both electrical component identification and thermal defect detection, and it is expected to be deployable on embedded devices for real-time inspection of transmission lines.
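
The cascading idea can be illustrated with a minimal sketch. It assumes that the first stage localizes components in the full infrared frame and that the second stage is applied to each cropped component to look for thermal defects; the abstract does not spell out this wiring, so it is an assumption. The names `Detection`, `crop`, and `cascade_detect`, the `min_score` threshold, and the detector callables are hypothetical stand-ins, not the trained YOLOv7 and YOLOv7-tiny models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]    # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Image = Sequence[Sequence[float]]  # rows of pixel values, a stand-in for an infrared frame


@dataclass
class Detection:
    box: Box
    label: str
    score: float


# Hypothetical detector interface: takes an image, returns detections.
Detector = Callable[[Image], List[Detection]]


def crop(image: Image, box: Box) -> List[Sequence[float]]:
    """Cut the region covered by `box` out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def cascade_detect(
    image: Image,
    component_detector: Detector,  # stage 1 (assumed role of the improved YOLOv7)
    defect_detector: Detector,     # stage 2 (assumed role of YOLOv7-tiny)
    min_score: float = 0.5,        # hypothetical confidence threshold
) -> List[Tuple[Detection, List[Detection]]]:
    """Run stage 2 on each confident stage-1 crop and pair the results."""
    results = []
    for comp in component_detector(image):
        if comp.score < min_score:
            continue
        x1, y1, _, _ = comp.box
        patch = crop(image, comp.box)
        # Shift defect boxes from crop coordinates back to full-frame coordinates.
        defects = [
            Detection(
                (d.box[0] + x1, d.box[1] + y1, d.box[2] + x1, d.box[3] + y1),
                d.label,
                d.score,
            )
            for d in defect_detector(patch)
            if d.score >= min_score
        ]
        results.append((comp, defects))
    return results
```

In such a design the second model only sees component crops rather than the whole frame, which is one plausible reason a lightweight second stage can suffice; whether this matches the proposed method in detail is not stated in the abstract.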