Ships navigating in polar regions may collide with ice masses, causing structural damage and endangering their occupants. It is therefore essential to promptly detect sea ice, icebergs, and passing ships. However, each individual data source has limitations, so sources should be combined to obtain more complete information. In this study, a polar multi-target local-scale dataset with five categories was constructed. Sea ice, icebergs, ice melt ponds, icebreakers, and inter-ice channels were identified with a single-shot detector (SSD), which reached a final mean average precision (mAP) of 70.19%. A remote sensing sea ice dataset with 15,948 labels was also constructed. The You Only Look Once version 5 (YOLOv5) model was improved with a Squeeze-and-Excitation (SE) network, the Funnel Activation (FReLU) function, and a Fast Spatial Pyramid Pooling and Cross Stage Partial Network (SPPCSPC-F) module. In the detection stage, the remote sensing images were sliced so that small targets could be detected. Simulated sea ice data were included to verify the model's generalization ability. The improved model was then trained and evaluated in an ablation experiment. The mAP, recall (R), and precision (P) of the improved YOLOv5 were 75.3%, 70.3%, and 75.4%, increases of 3.5%, 3.4%, and 1.9%, respectively, over the original model. The improved YOLOv5 was also compared with other models, namely YOLOv3, Faster-RCNN, and YOLOv4-tiny. The results indicated that the proposed model outperformed these conventional models. This study detected multiple targets at different scales in a polar region and fused the two data sources, avoiding the limitations of relying on a single source, and it provides a method to support polar ship path planning.
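The slicing operation mentioned for the detection stage can be illustrated with a minimal sketch. It assumes a simple overlapping-tile scheme: the function names `slice_boxes` and `to_full_image`, the 640-pixel tile size, and the 128-pixel overlap are illustrative choices, not values reported in this work. Each tile would be passed to the detector separately, and the resulting boxes shifted back into full-image coordinates.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def slice_boxes(width: int, height: int, tile: int = 640,
                overlap_px: int = 128) -> List[Box]:
    """Return overlapping tile windows that cover a width x height image.

    The tile size and overlap are assumed values for illustration only.
    """
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must be in [0, tile)")
    stride = tile - overlap_px

    def starts(length: int) -> List[int]:
        # A single window suffices when the image is smaller than a tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, stride))
        # The last window is aligned with the image border so nothing is missed.
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


def to_full_image(box: Box, tile_window: Box) -> Box:
    """Shift a box detected inside a tile back to full-image coordinates."""
    dx, dy = tile_window[0], tile_window[1]
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


if __name__ == "__main__":
    for window in slice_boxes(1000, 800):
        print(window)
```

Because neighbouring tiles overlap, a target near a tile border may be detected twice; in practice the shifted boxes would typically be merged with non-maximum suppression, a step this sketch leaves out.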