Accurate identification of tropical fish provides a crucial indicator of marine biodiversity and of the condition of coral reef ecosystems. However, current detection networks are prone to missed detections and misidentification because of mutual occlusion among fish and the complexity of the underwater environment. This paper proposes a modified approach named FishFocusNet, in which alterable kernel convolution (AKConv) modules, an asymptotic feature pyramid network (AFPN), and Shape-IoU are integrated into YOLOv8. To extract a more comprehensive set of fish features, AKConv modules with arbitrary kernel sizes replace the conventional fixed-shape kernels used for downsampling in the backbone. AFPN is adopted as the feature integration structure in the neck, where it strengthens feature fusion, including adaptive spatial fusion, between non-adjacent layers. In the detection head, Shape-IoU is employed to localize fish targets precisely. The effectiveness of these modifications is demonstrated by ablation and comparative experiments. The experimental results show that the optimized approach achieves an mAP of 81.8% with 2.4 M parameters and 3.6 GFLOPs. Moreover, compared with more complex models of similar scale, the proposed method raises recognition accuracy to 84.2% while significantly reducing computational cost.
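As a point of reference for the localization term, Shape-IoU builds on the standard intersection over union (IoU) between a predicted box and a ground-truth box. The following minimal sketch computes only that plain IoU; it does not reproduce the shape- and scale-dependent terms of Shape-IoU. The function name `box_iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration and are not taken from the FishFocusNet implementation.

```python
def box_iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative only: this is the baseline overlap measure, not Shape-IoU.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 region, so `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.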