Overhead transmission line detection using deep learning on aerial images captured by UAVs has been widely investigated. Despite its success, it remains limited by several factors, including inappropriate evaluation criteria and drastic variation in the scale of components across images. To mitigate these issues, a relative mean Average Precision (mAP) evaluation index is proposed to measure the model's detection performance on smaller objects more accurately. A data augmentation strategy that includes multi-scale transformation is adopted to alleviate the drastic scale variation. The existing Cascade R-CNN object detector is enhanced by incorporating Swin-v2 and a balanced feature pyramid to improve feature representation, while side-aware boundary localization is used to improve the localization accuracy of the model. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on CPLID and improves on the baseline by 7.8%, 11.8%, and 5.5% in mAP50, relative small-object mAP, and relative medium-object mAP, respectively. Additionally, the paper discusses the influence of the adopted data augmentation on the robustness of the model.
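
As a rough illustration of the multi-scale transformation mentioned above, the following minimal sketch rescales an image size and its bounding boxes by a randomly chosen factor. It is not the configuration used in this work: the function name `multiscale_resize`, the candidate scale factors, and the `(x_min, y_min, x_max, y_max)` pixel box format are all assumptions made for illustration only.

```python
import random


def multiscale_resize(width, height, boxes,
                      scales=(0.5, 0.75, 1.0, 1.25, 1.5), rng=None):
    """Rescale an image size and its boxes by one randomly chosen factor.

    Hypothetical helper: the scale set and box format are assumptions,
    not values taken from the paper.

    boxes: list of (x_min, y_min, x_max, y_max) tuples in pixels.
    Returns (new_width, new_height, new_boxes, scale).
    """
    rng = rng or random.Random()
    # Draw one factor per image so components appear at varied sizes
    # over the course of training.
    scale = rng.choice(scales)
    new_width = round(width * scale)
    new_height = round(height * scale)
    # Box coordinates scale linearly with the image, so the annotations
    # stay aligned with the resized pixels.
    new_boxes = [tuple(coord * scale for coord in box) for box in boxes]
    return new_width, new_height, new_boxes, scale
```

In a training pipeline, a step of this kind would typically be applied to each sample before batching, with the image pixels themselves resampled to the returned size by an image library.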