To address severe occlusion and tiny-scale objects in the multi-fitting detection task, the Scene Knowledge Integrating Network (SKIN), comprising a scene filter module (SFM) and a scene structure information module (SSIM), is proposed. First, the particularity of the scene in the multi-fitting detection task is analyzed and, based on professional knowledge of the power field and the way operators habitually identify fittings, an aggregation of fittings is defined as a scene. Scene knowledge accordingly comprises three categories of information: global context information, fine-grained visual information of the fittings, and scene structure information. Then, the SFM is designed to learn the global context information and the fine-grained visual information of the fittings, and the SSIM is designed to learn the scene structure information. Finally, the scene semantic features serve as the carrier for integrating the three categories of information into the relative scene features, which, after feature mining and feature integration, assist in recognizing occluded and tiny-scale fittings. Experiments show that the proposed network improves multi-fitting detection performance compared with Faster R-CNN and other state-of-the-art models. In particular, detection performance on occluded and tiny-scale fittings is significantly improved.