Transmission lines follow complex routes shaped by terrain and topography, which makes their maintenance and management challenging. As mechanization advances worldwide, the number of construction machines and large‐scale mechanical equipment continues to grow. Accidental contact between construction machinery and nearby transmission lines poses a serious threat to the stability of the power system. Traditional drones and inspection robots have difficulty monitoring construction machinery and equipment around transmission lines in real time and quickly identifying potential risks. Conventional image processing techniques and convolutional neural networks struggle to handle small targets and dense detection tasks involving multiple targets. To address these challenges, this paper proposes an intelligent detection algorithm, the Swin Transformer attention efficient you only look once (YOLO) algorithm (STAE‐YOLO). It combines the Swin Transformer global self‐attention mechanism, a cross‐channel fusion attention mechanism, an enhanced small object detection framework, and a focused and efficient regression loss function. Experimental results show that, compared with the baseline model, STAE‐YOLO improves mean average precision by 6.3%, precision by 3.7%, and recall by 3.1%. In addition, deploying the window multi‐head self‐attention mechanism in the model strengthens the global multi‐scale semantic information used in detection.
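The abstract names window multi‐head self‐attention but does not describe its implementation. As a rough illustration only, the following NumPy sketch shows the general idea as it is commonly described for Swin Transformer: the feature map is split into non‐overlapping windows, and multi‐head self‐attention is computed inside each window. The function names, the random projection weights (standing in for learned parameters), and the tensor sizes are all assumptions made for this example. It omits the relative position bias, the shifted‐window step, and the output projection, and it is not the STAE‐YOLO implementation.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    """
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-index axes first
    return x.reshape(-1, window_size * window_size, C)

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, window_size, num_heads, rng):
    """Multi-head self-attention computed independently inside each window.

    Hypothetical helper: random weights stand in for learned projections.
    """
    H, W, C = x.shape
    assert C % num_heads == 0
    head_dim = C // num_heads

    windows = window_partition(x, window_size)       # (nW, N, C)
    nW, N, _ = windows.shape

    w_qkv = rng.standard_normal((C, 3 * C)) / np.sqrt(C)  # assumed, not learned
    qkv = windows @ w_qkv                             # (nW, N, 3C)
    qkv = qkv.reshape(nW, N, 3, num_heads, head_dim)
    qkv = qkv.transpose(2, 0, 3, 1, 4)                # (3, nW, heads, N, head_dim)
    q, k, v = qkv[0], qkv[1], qkv[2]

    # Scaled dot-product attention between tokens of the same window only.
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)  # (nW, heads, N, N)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                    # (nW, heads, N, head_dim)

    # Merge the heads back into the channel dimension.
    out = out.transpose(0, 2, 1, 3).reshape(nW, N, C)
    return out, attn

# Example: an 8x8 feature map with 16 channels, 4x4 windows, 4 heads.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8, 16))
out, attn = window_self_attention(feature_map, window_size=4, num_heads=4, rng=rng)
```

Because attention is restricted to tokens within the same window, the cost grows linearly with the number of windows rather than quadratically with the full feature‐map size. In the Swin Transformer design, the shifted‐window step left out of this sketch is what lets information flow between neighboring windows across successive layers.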