Underwater object detection is highly complex and demands both high speed and high accuracy. In this paper, an underwater target detection model based on YOLOv8, named SPSM-YOLOv8 (SPDConv-PSA-SCDown-MPDIoU-YOLOv8), is proposed to address the problems of high computational complexity, slow detection speed, and low accuracy. Firstly, the SPDConv module replaces the standard convolutional module in the backbone network for feature extraction, which improves computational efficiency and reduces redundant computation. Secondly, the PSA (Polarized Self-Attention) mechanism is added to apply polarized filtering and enhancement to features along the channel and spatial dimensions, improving the accuracy of pixel-level prediction. Thirdly, the SCDown (spatial–channel decoupled downsampling) mechanism is introduced to reduce the computational cost by decoupling the spatial and channel operations while retaining information during downsampling. Finally, MPDIoU (Minimum Point Distance-based IoU) replaces the CIoU (Complete-IoU) loss function to accelerate the convergence of bounding box regression and improve its accuracy. The experimental results show that the detection accuracy of SPSM-YOLOv8 reaches 87.3% on the ROUD dataset and 76.4% on the UPRC2020 dataset, and that, compared with the YOLOv8n baseline model, the number of parameters and the amount of computation decrease by 4.3% and 4.9%, respectively. The detection frame rate reaches 189 frames per second on the ROUD dataset. The model thus meets the high accuracy requirements of underwater object detection algorithms while facilitating lightweight and fast edge deployment.
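To make the MPDIoU loss mentioned above concrete, the following is a minimal sketch in plain Python for a single pair of boxes. It assumes the commonly cited definition, in which the IoU is penalized by the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared diagonal of the input image. The function name `mpdiou_loss`, its parameters, and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not the implementation used in this work.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Illustrative MPDIoU loss for one predicted and one ground-truth box.

    Boxes are (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2 (assumed layout).
    img_w and img_h are the width and height of the input image.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area; zero when the boxes do not overlap.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU (eps guards against division by zero).
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Squared distances between matching corners of the two boxes.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal, then subtract from the IoU.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm

    return 1.0 - mpdiou
```

Under this definition the loss is close to zero when the two boxes coincide, and it still varies with the corner distances when the boxes do not overlap, which is one reason a corner-distance penalty can help regression converge.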