Because light penetrates poorly under water, sonar equipment plays a crucial role in many commercial and military operations. Underwater images nevertheless suffer from degradation caused by scattering and absorption, which leaves submerged objects poorly visible, so image enhancement is essential for improving their appearance and visibility. This research proposes HLAST-ACNet, an approach to Underwater Object Detection (UOD) that combines a hybrid Local Acuity Swin Transformer with an Adapted Coat-Net. HLAST-ACNet uses Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve image quality, incorporates a path aggregation network to integrate deep and shallow feature maps, and applies online hard example mining to improve training efficiency. It also replaces Region of Interest (ROI) pooling with ROI alignment, which mitigates quantization errors and improves detection accuracy. Compared with existing algorithms, HLAST-ACNet demonstrates significant improvements on the URPC2018 and OUC datasets, achieving precision rates of 91.25% and 92.36%, respectively. The model has a higher computational complexity than four existing methods, as measured by GFLOPs, with a per-image processing time of 20 ms and a reported frames-per-second (FPS) figure of 2.28 s. The model effectively addresses the challenges of detecting objects of varying sizes, including false detections, in complex underwater environments.
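
The abstract states that ROI alignment mitigates the quantization errors of ROI pooling but gives no implementation details. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the function names (`bilinear_sample`, `roi_align`, `roi_pool_quantized`), the single-channel feature map, the 2x2 output size, and the two-samples-per-axis averaging are all assumptions chosen for illustration.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map at a fractional (y, x)."""
    h, w = feature.shape
    # Clamp the sampling point to the valid area of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature[y0, x0] + dx * feature[y0, x1]
    bottom = (1 - dx) * feature[y1, x0] + dx * feature[y1, x1]
    return (1 - dy) * top + dy * bottom


def roi_align(feature, box, out_size=2, samples=2):
    """ROI alignment (illustrative): the box (y1, x1, y2, x2) is never
    rounded; each output bin averages samples x samples bilinear samples."""
    y1, x1, y2, x2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            vals = []
            for sy in range(samples):
                for sx in range(samples):
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    vals.append(bilinear_sample(feature, yy, xx))
            out[i, j] = np.mean(vals)
    return out


def roi_pool_quantized(feature, box, out_size=2):
    """ROI max pooling (illustrative): the box is rounded to integer
    coordinates first, which is the source of the quantization error.
    Assumes the rounded box lies inside the feature map."""
    y1, x1, y2, x2 = [int(round(v)) for v in box]
    ys = np.linspace(y1, y2, out_size + 1)
    xs = np.linspace(x1, x2, out_size + 1)
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            r0 = int(np.floor(ys[i]))
            r1 = max(int(np.ceil(ys[i + 1])), r0 + 1)
            c0 = int(np.floor(xs[j]))
            c1 = max(int(np.ceil(xs[j + 1])), c0 + 1)
            out[i, j] = feature[r0:r1, c0:c1].max()
    return out


if __name__ == "__main__":
    # Toy feature map whose value encodes position: f[y, x] = 10*y + x.
    feature = np.add.outer(10.0 * np.arange(8), np.arange(8.0))
    box = (1.3, 2.6, 5.3, 6.6)  # fractional box in feature-map coordinates
    print(roi_align(feature, box))
    print(roi_pool_quantized(feature, box))
```

On a feature map that varies linearly with position, the aligned output tracks the fractional box exactly, whereas the pooled output depends only on the rounded box. This is the kind of misalignment that matters most for small objects.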