As the demands for international marine transportation increase rapidly, effective port management has become an important issue. Automatic ship recognition can facilitate the realization of smart ports, and improve the efficiency of port operation and management. In order to take into account the processing efficiency and detection accuracy at the same time, the study presented an improved deep-learning network based on You only look once version 3 (Yolov3) for all-day ship detection with visible and infrared images. Yolov3 network can simultaneously improve the recognition ability of large and small objects through multiscale feature-extraction architecture. Considering reducing computational time and network complexity with relatively competitive detection accuracy, the study modified the architecture of Yolov3 by choosing an appropriate input image size, fewer convolution filters, and detection scales. In addition, the reduced Yolov3 was further modified with the spatial pyramid pooling (SPP) module to improve the network performance in feature extraction. Therefore, the proposed modified network can achieve the purpose of multi-scale, multi-type, and multi-resolution ship detection. In the study, a common self-built data set was introduced, aiming to conduct all-day and real-time ship detection. The data set included a total of 5557 infrared and visible light images from six common ship types in northern Taiwan ports. The experimental results on the data set showed that the proposed modified network architecture achieved acceptable performance in ship detection, with the mean average precision (mAP) of 93.2%, processing 104 frames per second (FPS), and 29.2 billion floating point operations (BFLOPs). Compared with the original Yolov3, the proposed method can increase mAP and FPS by about 5.8% and 8%, respectively, while reducing BFLOPs by about 47.5%. Furthermore, the computational efficiency and detection performance of the proposed approach have been verified in the comparative experiments with some existing convolutional neural networks (CNNs). In conclusion, the proposed method can achieve high detection accuracy with lower computational costs compared to other networks.