Robot grasping is a hot topic in robotics research. In relatively fixed industrial scenarios, robots can perform grasping tasks efficiently over long periods. In unstructured environments, however, objects are diverse, placed in random poses, and often stacked and occluding one another, which makes it difficult for a robot to recognize the target and plan the grasp. We therefore propose an accurate, real-time robot grasp detection method based on convolutional neural networks. A cascaded two-stage convolutional neural network model estimates grasp position and attitude in a coarse-to-fine manner: an R-FCN model extracts and screens candidate grasp-position regions and produces a rough angle estimate, and, to address the insufficient pose-detection accuracy of previous methods, an Angle-Net model refines the grasp angle. Tests on the Cornell dataset and online robot experiments show that the method can quickly compute the optimal grasp point and posture for irregular objects with arbitrary poses and different shapes, improving both detection accuracy and real-time performance over previous methods.
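The coarse-to-fine angle idea above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the bin count, the mock classifier scores, and the regressed offset are all assumptions, standing in for the R-FCN rough estimate and the Angle-Net refinement.

```python
# Hypothetical sketch of coarse-to-fine grasp-angle estimation: a coarse
# stage picks one of K discrete angle bins (here via argmax over mock
# scores), and a fine stage adds a regressed offset within that bin.
# K and the offset value are illustrative assumptions.

K = 18                    # coarse bins covering [0, 180) degrees
BIN_WIDTH = 180.0 / K     # 10 degrees per bin

def coarse_to_fine_angle(bin_scores, fine_offset_deg):
    """Combine a coarse bin choice with a fine within-bin offset."""
    best_bin = max(range(len(bin_scores)), key=lambda i: bin_scores[i])
    coarse_deg = best_bin * BIN_WIDTH        # left edge of the winning bin
    return (coarse_deg + fine_offset_deg) % 180.0

# Example: bin 7 wins (70 deg) and the fine stage regresses +3.2 deg.
scores = [0.01] * K
scores[7] = 0.9
print(coarse_to_fine_angle(scores, 3.2))  # roughly 73.2
```

The coarse stage only needs to be right to within one bin width, which is what lets the fine regressor stay small and fast.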
With the rapid development of sensor technology and artificial intelligence, video gesture recognition in the era of big data makes human-computer interaction more natural and flexible, bringing a richer interactive experience to teaching, on-board control, electronic games, etc. To perform robust recognition under illumination change, background clutter, rapid movement, and partial occlusion, an algorithm based on multi-level feature fusion of a two-stream convolutional neural network is proposed, comprising three main steps. First, a Kinect sensor captures red-green-blue-depth (RGB-D) images to establish a gesture database, and data augmentation is applied to the training and test sets. Then, a multi-level feature-fusion two-stream convolutional neural network model is established and trained. Experimental results show that the proposed model robustly tracks and recognizes gestures under complex backgrounds (such as similar skin tones, illumination changes, and occlusion); compared with the single-channel model, the average detection accuracy improves by 1.08% and mean average precision (mAP) by 3.56%. The average recognition rate of gestures under occlusion and varying light intensity was 93.98%. Finally, on the ASL, LaRED, and 1-miohand datasets, the recognition accuracy compares favorably with other methods.
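The two-stream idea can be illustrated with a minimal late-fusion sketch. Note this is an assumption-laden simplification: the abstract's model fuses feature maps at multiple levels inside the network, whereas this toy only combines the final per-class scores of an RGB stream and a depth stream; the gesture labels and fusion weight are made up for illustration.

```python
# Hypothetical late-score fusion for a two-stream (RGB + depth) gesture
# classifier: a weighted average of per-class scores from each stream.
# Labels and the weight w_rgb are illustrative assumptions.

def fuse_scores(rgb_scores, depth_scores, w_rgb=0.5):
    """Weighted average of per-class scores from the two streams."""
    return [w_rgb * r + (1.0 - w_rgb) * d
            for r, d in zip(rgb_scores, depth_scores)]

def predict(rgb_scores, depth_scores, labels, w_rgb=0.5):
    """Return the label with the highest fused score."""
    fused = fuse_scores(rgb_scores, depth_scores, w_rgb)
    return labels[max(range(len(fused)), key=lambda i: fused[i])]

labels = ["fist", "palm", "point"]   # illustrative gesture classes
# RGB alone favors "palm"; the depth stream tips the fusion to "point".
print(predict([0.2, 0.5, 0.3], [0.1, 0.3, 0.6], labels))  # -> point
```

Even this crude form shows why a second modality helps: the depth stream is unaffected by skin-tone-like backgrounds and lighting changes that confuse the RGB stream.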
With the development of deep learning, the continuous improvement of computing power, and the needs of social production, target detection has become a research hotspot in recent years. However, existing target detection algorithms are more sensitive to large targets and do not consider feature-to-feature relationships, which leads to high false-detection and missed-detection rates for small targets. A small-target detection method (C-SSD) based on an improved SSD is proposed: it replaces the VGG-16 backbone of the SSD network with an improved dense convolutional network (C-DenseNet) that achieves further feature fusion through fast connections between dense blocks. Introducing residuals in the prediction layer and DIoU-NMS further improves detection accuracy. Experimental results demonstrate that C-SSD outperforms other networks at three different image scales and achieves the best accuracy of 83.8% on the PASCAL VOC2007 test set, proving the effectiveness of the algorithm. C-SSD achieves a better balance of speed and accuracy, showing excellent performance in rapid detection of small targets.
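The DIoU-NMS step named above can be shown concretely. This is a minimal pure-Python sketch of the general DIoU-NMS technique, not the paper's code: instead of suppressing by plain IoU, a box is suppressed when the IoU with the kept box minus the normalized center distance (the DIoU) exceeds a threshold, which helps keep nearby but distinct small objects.

```python
# Minimal sketch of DIoU-NMS. Boxes are (x1, y1, x2, y2) tuples.
# DIoU(a, b) = IoU(a, b) - d^2 / c^2, where d is the distance between
# box centers and c is the diagonal of the smallest enclosing box.

def diou(a, b):
    # Intersection over union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter)
    # Squared center distance over squared enclosing-box diagonal.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
       + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2

def diou_nms(boxes, scores, thresh=0.5):
    """Greedy NMS that suppresses by DIoU instead of IoU."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if diou(boxes[i], boxes[j]) < thresh]
    return keep

# Two heavily overlapping boxes plus one distant box: the duplicate is
# suppressed, the distant small detection survives.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(diou_nms(boxes, [0.9, 0.8, 0.7]))  # -> [0, 2]
```

Because the center-distance penalty pushes the score below the threshold for boxes whose centers are far apart, two adjacent small objects with some box overlap are less likely to be merged into one detection than under plain IoU-based NMS.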