As robots begin to collaborate with humans in everyday workspaces, they need a deeper understanding of tool-use tasks. To address tool use in human-robot collaboration, we propose a modular system built around collaborative tasks. The first part of the system finds task-related operating areas: a multi-layer instance segmentation network finds the tools needed for the task and classifies each object according to the robot's state in the collaborative task. This yields a state semantic region with the “leader-assistant” state. In the second part, to predict the optimal grasp and handover configuration, we propose a multi-scale grasping network (MGR-Net) based on the mask of the state semantic region, which adapts better to the change in receptive field caused by the state semantic region. Compared with the traditional method, our method achieves higher accuracy. The whole system also achieves good results on an untrained real-world tool dataset that we constructed. To further verify the effectiveness of the generated grasp representations, we use a Sawyer-based robot platform to demonstrate the high performance of our system.
Task planning is a crucial component in facilitating robot multi-task manipulation. Language-based task planning methods are practical for receiving commands from humans in real-life scenarios and require only low-cost labeled data. However, existing methods often rely on sequence models for planning, which primarily map language to sequences of sub-tasks while neglecting knowledge about tasks and objects. To overcome these limitations, we propose a knowledge-based task planning approach called Recurrent Graph Convolutional Network (RGCN). Its novel structure combines GCN (Kipf and Welling in International Conference on Learning Representations (ICLR), 2017) and LSTM (Hochreiter and Schmidhuber in Neural Comput 9(8):1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735), which enables it to leverage knowledge graph data and historical predictions. The experimental results show that our approach achieves a task planning success rate of 95.7%, significantly surpassing the best baseline method, which achieves 78.7%. Furthermore, we evaluate multi-task manipulation performance on a specific set of 20 tasks in a simulated environment. Notably, RGCN combined with pre-trained primitive tasks achieves the highest success rate among the state-of-the-art multi-task learning methods compared. Our method proves effective for language-conditioned task planning and is suitable for instructing robots in multi-task manipulation.
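To make the GCN half of that combination more concrete, the following is a minimal sketch of a single graph-convolution step using the propagation rule from Kipf and Welling, ReLU(D^-1/2 (A + I) D^-1/2 X W). It is not the RGCN architecture from the abstract: the function names `matmul` and `gcn_layer`, the toy two-node graph, and the weights are all assumptions chosen for illustration, and the LSTM part is omitted entirely.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, feats, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj    -- n x n adjacency matrix (hypothetical knowledge-graph edges)
    feats  -- n x f node feature matrix X
    weight -- f x h weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Node degrees in the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbor features, project them, then apply ReLU.
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: two connected nodes with one feature each (illustrative values).
adj = [[0, 1], [1, 0]]
feats = [[1.0], [3.0]]
weight = [[1.0]]
print(gcn_layer(adj, feats, weight))  # [[2.0], [2.0]]
```

In this toy case each node ends up with the average of its own and its neighbor's feature. A recurrent planner in the spirit of the abstract would feed such node embeddings, together with earlier predictions, into an LSTM; that wiring is not specified in the source and is left out here.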
An accurate and robust keypoint detection method is vital for autonomous harvesting systems. This paper proposes an autonomous harvesting framework for dome-type planted pumpkins, with a keypoint (grasping and cutting point) detection method built on an instance segmentation architecture. To address the overlapping problem in agricultural environments and improve segmentation precision, we propose a pumpkin fruit and stem instance segmentation architecture that fuses a transformer with point rendering. A transformer network serves as the backbone to achieve higher segmentation precision, and point rendering is applied to obtain finer masks, especially at the boundaries of overlapping areas. In addition, our keypoint detection algorithm models the relationships among fruit and stem instances and estimates grasping and cutting keypoints. To validate the effectiveness of our method, we created a pumpkin image dataset with manually annotated labels and used it to run extensive experiments on instance segmentation and keypoint detection. The pumpkin fruit and stem instance segmentation results show that the proposed method reaches a mask mAP of 70.8% and a box mAP of 72.0%, gains of 4.9% and 2.5% over state-of-the-art instance segmentation methods such as Cascade Mask R-CNN. An ablation study confirms the effectiveness of each improved module in the instance segmentation architecture. The keypoint estimation results indicate that our method is promising for fruit picking tasks.