To address the low accuracy of traditional methods for detecting surface defects in lithium batteries, as well as the large model size and high computational complexity of current detection models, this article proposes LF-YOLOv4, a new lightweight model with enhanced image feature fusion. First, the CSPdarknet backbone network in YOLOv4 is replaced with the lightweight MobileNetv2 network, which greatly reduces the number of network parameters while preserving the ability to extract features. Second, to further reduce the number of model parameters and the computational complexity while keeping any accuracy loss as small as possible, an improved depthwise separable convolution (DSC-SE-HsId) is designed and used to replace some of the ordinary convolutions in the Neck and Head networks. Third, to compensate for part of the accuracy lost to these lightweight operations and to fuse feature maps of different scales into more complete feature information, a new lightweight adaptive spatial feature fusion module (LSE-ASFF) is designed and embedded after the existing path aggregation network. To verify the performance and general applicability of the improved model, we tested it on a self-built lithium battery surface defect dataset and on the steel surface defect dataset provided by Northeastern University. The experimental results show that the improved model achieves the highest TOPSIS score on both datasets. Compared with YOLOv4 on the self-built dataset, the improved model raises mAP50 by 2.97% to 97.83%, while requiring only 18.16% of the parameters, 13.87% of the FLOPs, and 21.02% of the model size of the original model, and it shortens training time by 30.67%.
Lastly, example analyses and comparisons demonstrate the effectiveness and superiority of the improved model.
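
To illustrate why replacing ordinary convolutions with depthwise separable ones reduces the parameter count, the following minimal sketch compares the weight counts of the two layer types. It covers only a plain depthwise separable convolution, not the DSC-SE-HsId variant proposed here, whose additional components are not specified in this abstract. The function names and the layer shape (256 input channels, 512 output channels, 3 x 3 kernel) are assumptions chosen for illustration and are not taken from the model.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsc_params(c_in, c_out, k):
    """Weights in a plain depthwise separable convolution (bias omitted):
    one k x k filter per input channel, followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, used only for illustration.
c_in, c_out, k = 256, 512, 3

standard = conv_params(c_in, c_out, k)
separable = dsc_params(c_in, c_out, k)
ratio = separable / standard  # equals 1 / k**2 + 1 / c_out

print(standard, separable, round(ratio, 4))
# prints: 1179648 133376 0.1131
```

For this assumed shape, the separable layer needs roughly 11% of the weights of the standard layer. The ratio is 1/k² + 1/C_out in general, so the saving grows with the kernel size and the number of output channels.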