As one of the typical application-oriented solutions to autonomous robot navigation, visual simultaneous localization and mapping is essentially restricted to a simplex understanding of the environment based on geometric image features. By contrast, semantic simultaneous localization and mapping, which is characterized by high-level environmental perception, has opened the door to applying image semantics to efficiently estimate poses, detect loop closures, build 3D maps, and so on. This article presents a detailed review of recent advances in semantic simultaneous localization and mapping, mainly covering how they address perception, robustness, and accuracy. Specifically, the concept of a “semantic extractor” and the framework of “modern visual simultaneous localization and mapping” are presented first. After stating the challenges associated with perception, robustness, and accuracy, we discuss some open problems from a macroscopic view and attempt to find answers. We argue that multiscaled map representation, object simultaneous localization and mapping systems, and deep neural network-based simultaneous localization and mapping pipeline design could be effective solutions for visual simultaneous localization and mapping fused with image semantics.
The grade of wheat quality depends on the proportion of unsound kernels. Therefore, the rapid detection of unsound wheat kernels is important for wheat rating and evaluation. In practice, however, unsound kernels are hand-picked, which makes the process time-consuming and inefficient. Meanwhile, methods based on traditional image processing cannot separate adhering kernels well. To solve these problems, this paper proposes an unsound wheat kernel recognition algorithm based on an improved Mask R-CNN. First, we changed the feature pyramid network (FPN) to a bottom-up pyramid network to strengthen the low-level information. Then, an attention mechanism (AM) module was added between the feature extraction network and the pyramid network to improve the detection accuracy for small targets. Finally, the region proposal network (RPN) was optimized to improve the prediction performance. Experiments showed that the improved Mask R-CNN algorithm could identify the unsound kernels more quickly and accurately while handling adhesion problems well. The precision and recall were 86% and 91%, respectively, and the inference time on the test set, with about 200 targets per image, was 7.83 s. Additionally, we compared the improved model with other existing segmentation models, and experiments showed that our model achieved higher accuracy and performance than the other models, laying the foundation for wheat grading.
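The reported precision and recall can be combined into a single F1 score. The abstract does not report F1, so the sketch below is only a derived figure from the two published values, and the function name `f1_score` is our own.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (returns 0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: precision 86%, recall 91%.
f1 = f1_score(0.86, 0.91)
print(f"F1 = {f1:.4f}")  # F1 = 0.8843
```

Because F1 is a harmonic mean, it sits closer to the lower of the two values, which here is precision.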
Global wheat head detection is difficult because wheat varieties, planting densities, and growth periods differ across countries. In addition, the illumination conditions during image collection and the complex field background reduce the detection accuracy. It is also hard to accurately detect targets that are occluded or only partially visible in the image. To solve these problems, this paper proposes an improved YOLOv5 algorithm that integrates separable convolution and attention mechanisms. Firstly, the number of CSP modules in YOLOv5 is reduced to shrink memory consumption. Subsequently, the vanilla convolutions in the CSP modules are replaced by separable convolutions, which are also added to the fusion path, to reduce redundant information in the feature maps and thereby the complexity of the model. In addition, the co-attention mechanism is added to the backbone. Finally, the feature fusion module is adjusted so that the high-level features fuse more low-level information. The results show that the mAP of the improved algorithm reaches 93.8%, which is 4.2% higher than that of the original YOLOv5 algorithm, and that the FPS is 27.4, which is 1.3 higher than that of YOLOv5. The model evaluation places particular emphasis on comparison with YOLOv7; other YOLO-series and mainstream detection algorithms are also compared, and the results show that our model has the best inference time and the best accuracy on high-resolution images.
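To illustrate why replacing vanilla convolutions with separable ones reduces model complexity, the sketch below compares weight counts. It assumes the depthwise separable form (a per-channel k×k depthwise convolution followed by a 1×1 pointwise convolution); the abstract does not specify the variant, and the kernel size and channel counts are hypothetical, not taken from the paper. Biases are ignored.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (no bias)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
k, c_in, c_out = 3, 128, 256
standard = conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
print(standard, separable, round(separable / standard, 3))
# 294912 33920 0.115
```

The ratio equals 1/c_out + 1/k², so for 3×3 kernels and wide layers the separable form needs roughly one ninth of the weights.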
The deep belief network (DBN) is now recognized as a powerful and eminently practical tool for large-scale data processing. The main characteristics of a DBN are the feature extension from low-level content to high-level data association and the representation of the joint distribution between original data and matched labels. For a wheeled robot with no other available location reference, reliance on the internally integrated inertial measurement units (IMUs) requires the robot to perform efficient fault diagnosis to locate and identify faults, especially the accumulated error caused by large gyroscope drifts. An optimized DBN-based fault diagnosis design is proposed to deal with such complex and diverse faults. The highlight of the proposed DBN model lies in its combination of weight value optimization via an inexact LSA-GA (short for 'inexact linear searching algorithm-genetic algorithm') and dynamic adjustment of the hidden-layer neurons of the constituent RBMs (short for 'restricted Boltzmann machines'). The problems associated with DBN anatomy, bat algorithm (BA) description, and fault diagnosis modeling are discussed in detail. Real robot platform experiments and dataset tests are conducted. The results indicate that the optimized DBN design leads to better fault classification with excellent generalization ability on the given datasets, and that the adjustable 'DBN structure' contributes to extracting data associations among multiple fault categories. The proposed scheme may therefore be considered to provide preferred reference models for a class of data-based fault diagnosis problems.
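The progression from low-level content to high-level features in a DBN can be sketched as an upward pass through stacked RBMs, where each layer computes p(h_j = 1 | v) = sigmoid(b_j + Σ_i v_i W_ij). This is only a generic illustration: the 4-3-2 layer sizes and zero weights are placeholders, and it does not implement the paper's LSA-GA weight optimization or its dynamic adjustment of hidden-layer neurons.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def rbm_hidden_probs(visible, weights, hidden_bias):
    """Activation probability of each hidden unit given the visible vector.

    weights[i][j] connects visible unit i to hidden unit j.
    """
    return [
        sigmoid(hidden_bias[j] + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


def dbn_upward_pass(visible, layers):
    """Feed the input through each RBM in turn; returns top-layer probabilities."""
    activations = visible
    for weights, hidden_bias in layers:
        activations = rbm_hidden_probs(activations, weights, hidden_bias)
    return activations


# Hypothetical 4-3-2 stack with placeholder (all-zero) weights and biases.
layers = [
    ([[0.0] * 3 for _ in range(4)], [0.0] * 3),
    ([[0.0] * 2 for _ in range(3)], [0.0] * 2),
]
features = dbn_upward_pass([1.0, 0.0, 1.0, 0.0], layers)
print(features)  # [0.5, 0.5] because sigmoid(0) = 0.5
```

Changing the number of hidden units in a layer only changes the shapes of the corresponding `weights` and `hidden_bias` entries, which is the part of the structure that an adjustable design would vary.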