In research on green vegetation coverage in the field of remote sensing image segmentation, crop planting area is often obtained by semantic segmentation of images taken from high altitude. This method can estimate the proportion of cultivated land in a region (such as a country), but it does not reflect the real situation of a particular farmland. Therefore, this paper builds a dataset from low-altitude images of farmland. After comparing several mainstream semantic segmentation algorithms, a new method better suited to farmland vacancy segmentation is proposed. Additionally, the Strip Pooling module (SPM) and the Mixed Pooling module (MPM), both built around strip pooling, are designed and fused into the semantic segmentation network to better extract vacancy features. Considering the high cost of manual data annotation, this paper uses an improved ResNet network as the backbone and applies data augmentation to improve the performance and robustness of the model. As a result, the proposed method achieves 95.6% accuracy, 77.6% mIoU, and a 7% error rate on the test set. Compared with the existing model, mIoU is improved by nearly 4%, reaching a level suitable for practical application.
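To make the strip pooling idea concrete, the following is a minimal PyTorch sketch of a strip-pooling block in the spirit of the SPM described above; the channel sizes, 1-D kernel widths, and sigmoid gating are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class StripPooling(nn.Module):
    """Illustrative strip pooling block (assumed design, not the paper's exact SPM):
    pool along each axis into H x 1 and 1 x W strips, refine with 1-D convs,
    broadcast back to H x W, and gate the input feature map."""
    def __init__(self, channels):
        super().__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # -> (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # -> (N, C, 1, W)
        self.conv_h = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0), bias=False)
        self.conv_w = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1), bias=False)
        self.fuse = nn.Conv2d(channels, channels, 1, bias=False)

    def forward(self, x):
        n, c, h, w = x.shape
        sh = self.conv_h(self.pool_h(x)).expand(n, c, h, w)  # vertical strips
        sw = self.conv_w(self.pool_w(x)).expand(n, c, h, w)  # horizontal strips
        return x * torch.sigmoid(self.fuse(sh + sw))         # gated fusion
```

Long, narrow pooling windows like these capture elongated structures (for example, unplanted strips between crop rows) that square pooling kernels tend to blur.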
Rice pests are one of the main factors affecting rice yield. Accurate pest identification enables timely preventive measures that avoid economic losses. Most existing open-source datasets for rice pest identification contain only a small number of samples, or suffer from inter- and intra-class variance and data imbalance, which limits the application of deep learning techniques in this field. In this paper, based on the IP102 dataset, we first assembled a large-scale dataset for rice pest identification through web crawling and manual screening, and named it IP_RicePests. Specifically, the dataset includes 8,248 images belonging to 14 categories. The IP_RicePests dataset was then expanded to 14,000 images via the ARGAN data augmentation technique to address the difficulty of obtaining large samples of rice pests. Finally, parameters pretrained on the public ImageNet dataset with the VGGNet, ResNet, and MobileNet networks were used as initial values for training on the target data to classify rice pest images. The experimental results show that all three classification networks combined with transfer learning achieve good recognition accuracy, with the highest classification accuracy on the IP_RicePests dataset obtained by fine-tuning the parameters of the VGG16 network. In addition, after ARGAN data augmentation all three models show clear accuracy improvements, and fine-tuning the VGG16 network parameters again yields the highest accuracy on the augmented IP_RicePests dataset. This demonstrates that CNNs combined with transfer learning can employ the ARGAN data augmentation technique to overcome the difficulty of obtaining large sample sizes and improve the efficiency of rice pest identification. This study provides foundational data and technical support for rice pest identification.
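As an illustration of the transfer learning setup described above, here is a minimal PyTorch/torchvision sketch that initializes VGG16 with ImageNet weights and swaps the classifier head for the 14 IP_RicePests categories; the layer-freezing choice is an assumption for illustration, not the authors' reported configuration.

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 14  # IP_RicePests categories

# Load VGG16 with ImageNet-pretrained weights as the initial parameter values.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Replace the final 1000-way ImageNet classifier with a 14-way head.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_CLASSES)

# Optionally freeze the earliest convolutional layers so fine-tuning
# only updates higher-level features (an assumed choice for illustration).
for param in model.features[:10].parameters():
    param.requires_grad = False
```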
Fast, accurate, and non-destructive large-scale detection of sweet cherry ripeness is the key to determining the optimal harvesting period and to accurate grading by ripeness. Owing to the complexity and variability of the orchard environment and to multi-scale, occluded, and even overlapping fruit, detection accuracy remains low even with the mainstream YOLOX algorithm when large amounts of labeled data are unavailable. In this paper, we propose an improved YOLOX target detection algorithm to quickly and accurately detect sweet cherry ripeness categories in complex environments. First, we took a total of 2,400 high-resolution images of immature, semi-ripe, and ripe sweet cherries in an orchard in Hanyuan County, Sichuan Province, covering complex conditions such as sunny and cloudy days, branch and leaf shading, fruit overlap, distant views, and green fruits whose color resembles the leaves, and formed a dataset dedicated to sweet cherry ripeness detection, named SweetCherry, by manually labeling 36,068 samples. On this basis, an improved YOLOX target detection algorithm, YOLOX-EIoU-CBAM, was proposed, which embeds the Convolutional Block Attention Module (CBAM) between the backbone and neck of the YOLOX model to improve the model's attention to channel and spatial information, and replaces the original bounding box loss function of the YOLOX model with the Efficient IoU (EIoU) loss to make the regression of prediction boxes more accurate. Finally, we validated the feasibility and reliability of the YOLOX-EIoU-CBAM network on the SweetCherry dataset. The experimental results show that the proposed method significantly outperforms the traditional Faster R-CNN and SSD300 algorithms in terms of mean Average Precision (mAP), recall, model size, and single-image inference time. Compared with the YOLOX model, the mAP of this method is improved by 4.12%, recall by 4.6%, and F-score by 2.34%, while model size and single-image inference time remain essentially unchanged. The method copes well with complex backgrounds such as fruit overlap and branch and leaf occlusion, and can provide a data foundation and technical reference for similar target detection problems.
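To illustrate the bounding box loss swap described above, below is a self-contained sketch of the EIoU loss for axis-aligned (x1, y1, x2, y2) boxes, following the published EIoU formulation (an IoU term plus center-distance, width, and height penalties); this is a generic implementation, not the paper's exact code.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU loss sketch for (N, 4) corner-format boxes (x1, y1, x2, y2)."""
    # Intersection and union for the IoU term.
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box, used to normalize the penalty terms.
    ex1 = torch.min(pred[:, 0], target[:, 0])
    ey1 = torch.min(pred[:, 1], target[:, 1])
    ex2 = torch.max(pred[:, 2], target[:, 2])
    ey2 = torch.max(pred[:, 3], target[:, 3])
    cw, ch = ex2 - ex1, ey2 - ey1

    # Center-distance penalty.
    pcx, pcy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    tcx, tcy = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    rho2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2

    # Width and height penalties.
    pw, ph = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    tw, th = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]

    return (1 - iou
            + rho2 / (cw ** 2 + ch ** 2 + eps)
            + (pw - tw) ** 2 / (cw ** 2 + eps)
            + (ph - th) ** 2 / (ch ** 2 + eps)).mean()
```

Penalizing width and height differences directly, rather than through an aspect-ratio term as in CIoU, is what makes the regression of tightly clustered, overlapping fruit boxes converge more precisely.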
The accurate segmentation of significant rice diseases and assessment of the degree of disease damage are the keys to their early diagnosis and intelligent monitoring, and are the core of accurate pest control and information management. Deep learning applied to rice disease detection and segmentation can significantly improve the accuracy of disease detection and identification, but requires a large number of training samples to determine the optimal parameters of the model. This study proposes a lightweight network based on copy-paste augmentation and semantic segmentation for accurate disease region segmentation and severity assessment. First, a dataset for significant rice disease segmentation was selected and collated from three open-source datasets, containing 450 sample images in three categories: rice leaf bacterial blight, blast, and brown spot. Then, to increase sample diversity, a data augmentation method called rice leaf disease copy-paste (RLDCP) was proposed, which expands the collected disease samples using the copy-and-paste concept. The new RSegformer model was then trained by taking the lightweight semantic segmentation network Segformer, replacing its backbone, combining an attention mechanism, and changing the upsampling operator, so that the model better balances local and global information, speeds up training, and reduces overfitting. The results show that, compared with traditional data augmentation methods, RLDCP effectively improves the accuracy and generalisation of the semantic segmentation model, raising its MIoU by about 5% with a dataset only twice the size. RSegformer achieves 85.38% MIoU at a model size of 14.36 M. The method proposed in this paper can quickly, easily, and accurately identify disease regions, their species, and the degree of disease damage, providing a reference for timely and effective rice disease control.
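To make the copy-paste idea concrete, here is a minimal NumPy sketch of the core operation: transplanting the pixels of one disease class from a source image onto a target image and updating the target's mask accordingly. It assumes both samples share the same shape and omits the leaf-aware placement and blending that RLDCP itself would involve.

```python
import numpy as np

def copy_paste(src_img, src_mask, dst_img, dst_mask, class_id):
    """Core copy-paste step (illustrative sketch, not the full RLDCP method):
    copy every pixel of `class_id` from the source sample into the target
    sample and stamp the class into the target's segmentation mask."""
    region = src_mask == class_id          # boolean (H, W) lesion region
    out_img, out_mask = dst_img.copy(), dst_mask.copy()
    out_img[region] = src_img[region]      # transplant lesion pixels
    out_mask[region] = class_id            # keep the mask consistent
    return out_img, out_mask
```

Because each paste produces a new image-mask pair with pixel-accurate labels for free, this style of augmentation can roughly double a small dataset without any additional annotation effort.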
Introduction: The difficulties in tea shoot recognition are that recognition is affected by lighting conditions, images whose background color resembles the shoots are challenging to segment, and leaves occlude and overlap one another. Methods: To solve the problem of low accuracy in dense small object detection of tea shoots, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Then, the YOLOv5 model is improved and applied to dense, tiny tea shoot detection. Second, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and one feature layer-based multimodal image fusion method; for the feature layer method, a cross-modal fusion module (FFA) based on frequency domain and attention mechanisms is designed to adaptively align and focus on critical regions across intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection of small dense objects in natural environments with the assistance of transfer learning techniques. Results and discussion: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% compared with the benchmark model while using fewer parameters and less computation. Compared with any single modality, the multimodal image fusion methods increase the mAP50 value in all cases, with the FFA-based method obtaining the highest mAP50 of 0.827. When the pre-training strategy is applied after scale matching, mAP improves by a further 1% and 1.4% on the two datasets. The multimodal optimization approach of this paper can provide a basis and technical support for dense small object detection.
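As a sketch of what data layer-based fusion might look like, the snippet below stacks RGB (3), depth (1), and infrared (1) channels and projects them back to three channels so a standard detector backbone can consume the result; the 5-channel layout and 1x1 projection are assumptions for illustration, not the paper's exact fusion designs.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Data-layer fusion sketch (assumed layout, not the paper's exact design):
    concatenate RGB, depth, and infrared along the channel axis, then use a
    1x1 convolution to project back to 3 channels for a standard backbone."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Conv2d(5, 3, kernel_size=1)

    def forward(self, rgb, depth, ir):
        # rgb: (N, 3, H, W); depth, ir: (N, 1, H, W), spatially aligned
        x = torch.cat([rgb, depth, ir], dim=1)  # (N, 5, H, W)
        return self.proj(x)
```

Fusing at the data layer like this leaves the detector untouched, whereas the feature layer route with a module such as FFA trades that simplicity for the ability to weight modalities adaptively per region.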