This article proposes a lightweight underwater biological target detection network based on the improvement of YOLOXs, addressing the challenges of complex and dynamic underwater environments, limited memory in underwater devices, and constrained computational capabilities. Firstly, in the backbone network, GhostConv and GhostBottleneck are introduced to replace standard convolutions and the Bottleneck1 structure in CSPBottleneck_1, significantly reducing the model’s parameter count and computational load, facilitating the construction of a lightweight network. Next, in the feature fusion network, a Contextual Transformer block replaces the 3 × 3 convolution in CSPBottleneck_2. This enhances self-attention learning by leveraging the rich context between input keys, improving the model’s representational capacity. Finally, the positioning loss function Focal_EIoU Loss is employed to replace IoU Loss, enhancing the model’s robustness and generalization ability, leading to faster and more accurate convergence during training. Our experimental results demonstrate that compared to the YOLOXs model, the proposed YOLOXs-GCE achieves a 1.1% improvement in mAP value, while reducing parameters by 24.47%, the computational load by 26.39%, and the model size by 23.87%. This effectively enhances the detection performance of the model, making it suitable for complex and dynamic underwater environments, as well as underwater devices with limited memory. The model meets the requirements of underwater target detection tasks.