To address the difficulty of accurately locating and identifying apple leaf diseases of varying scales and shapes against complex backgrounds in natural scenes, this study proposed an apple leaf disease detection method based on an improved YOLOv5s model. First, the model used a bidirectional feature pyramid network (BiFPN) for efficient multi-scale feature fusion. Then, transformer and convolutional block attention module (CBAM) attention mechanisms were added to suppress interference from irrelevant background information, which strengthened the representation of disease features and increased the accuracy and recall of the model. Experimental results showed that the proposed BTC-YOLOv5s model (model size 15.8M) effectively detected four types of apple leaf diseases in natural scenes, with a mean average precision (mAP) of 84.3%. On an octa-core CPU, the model processed 8.7 leaf images per second on average. Compared with the classic detection models SSD, Faster R-CNN, YOLOv4-tiny, and YOLOX, the proposed model increased mAP by 12.74%, 48.84%, 24.44%, and 4.2%, respectively, while offering higher detection accuracy and faster detection speed. The proposed model was also robust, with mAP exceeding 80% under strong noise conditions such as bright light, dim light, and blurred images. In conclusion, BTC-YOLOv5s was found to be lightweight, accurate, and efficient, making it suitable for deployment on mobile devices. The proposed method could provide technical support for early intervention and treatment of apple leaf diseases.
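The multi-scale fusion step named in the abstract can be illustrated with a small sketch. The abstract does not say which fusion rule the model uses, so this assumes the "fast normalized fusion" commonly associated with BiFPN, in which each input feature map gets a learnable weight that is clipped to be non-negative and then normalized. The function name `fast_normalized_fusion`, the `eps` default, and the single-channel list-of-lists feature maps are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence

# A single-channel H x W feature map; a simplification for illustration only.
FeatureMap = List[List[float]]


def fast_normalized_fusion(
    features: Sequence[FeatureMap],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> FeatureMap:
    """Fuse same-sized feature maps using non-negative, normalized weights.

    Assumed rule: out = sum_i(relu(w_i) * feature_i) / (eps + sum_j relu(w_j)).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature map")

    # ReLU keeps every (learnable) weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    # eps guards against division by zero when all weights are clipped to 0.
    norm = eps + sum(clipped)

    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for fmap, w in zip(features, clipped):
        for r in range(height):
            for c in range(width):
                fused[r][c] += w * fmap[r][c] / norm
    return fused
```

In a real detector the inputs would be multi-channel tensors resized to a common resolution and the weights would be trained with the rest of the network; the sketch only shows how the weighting lets the model favor the more informative scale.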