2021
DOI: 10.1109/lra.2021.3062004

How to Select and Use Tools? : Active Perception of Target Objects Using Multimodal Deep Learning

Abstract: Selection of appropriate tools and use of them when performing daily tasks is a critical function for introducing robots for domestic applications. In previous studies, however, adaptability to target objects was limited, making it difficult to accordingly change tools and adjust actions. To manipulate various objects with tools, robots must both understand tool functions and recognize object characteristics to discern a tool-object-action relation. We focus on active perception using multimodal sensorimotor d…

Citations: cited by 39 publications (21 citation statements)
References: 20 publications
Citation types: 0 supporting, 21 mentioning, 0 contrasting
“…Therefore, when a command with the same position and task completion time was input, the same motion with very little variation was generated without considering the size and shape of the object. Recently, methods using raw images have been studied [51], [52]. These studies add raw images to the NN input and learn to generate appropriate motions in response to changes in the position, and shape of the object.…”
Section: Results (mentioning)
confidence: 99%
“…However, the spatial information in this study is limited to the center position of the pancake at the beginning of the task, and the shape and size of the object were not considered. Therefore, our future work is to integrate the proposed method with a real-time image-based motion generation method [51] and a method that considers the shape and size of multiple objects [52] to expand the tasks that can be performed by the robot in space and time.…”
Section: Discussion (mentioning)
confidence: 99%
“…Cf nodes with small time constants learn movement primitives in the data, whereas Cs nodes with large time constants learn sequences. By combining these three node types, long, complex time series data can be learned, the usefulness of which for manipulation has been confirmed in several studies Yang et al (2016), Takahashi et al (2017), and Saito et al (2018b, a, 2020, 2021).…”
Section: Tool-use Model (mentioning)
confidence: 90%
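
The role of the time constants mentioned in the statement above can be illustrated with a minimal sketch. It assumes the common leaky-integrator update used in multiple-timescale recurrent networks, u(t+1) = (1 - 1/tau) u(t) + (1/tau) (W y(t) + b), and is not taken from the cited papers. The function name `mtrnn_step`, the unit counts, the weights, and the tau values are all hypothetical choices made for illustration.

```python
import numpy as np

def mtrnn_step(u, y, W, b, tau):
    """One leaky-integrator update with a per-unit time constant tau (>= 1).

    A larger tau keeps more of the previous internal state u, so the unit
    responds slowly; a smaller tau lets the unit follow its input quickly.
    """
    u_next = (1.0 - 1.0 / tau) * u + (1.0 / tau) * (W @ y + b)
    return u_next, np.tanh(u_next)

# Hypothetical setup: three fast units (tau = 2) and two slow units (tau = 50).
tau = np.array([2.0, 2.0, 2.0, 50.0, 50.0])
rng = np.random.default_rng(0)
W = rng.normal(scale=0.3, size=(5, 5))  # illustrative recurrent weights

u = np.zeros(5)
y = np.tanh(u)
history = []
for t in range(200):
    b = np.full(5, np.sin(0.3 * t))     # same oscillating drive for every unit
    u, y = mtrnn_step(u, y, W, b, tau)
    history.append(u.copy())

# Average step-to-step change of each unit's internal state.
history = np.array(history)
step_change = np.abs(np.diff(history, axis=0)).mean(axis=0)
print("mean |delta u| per unit:", np.round(step_change, 4))
```

Under these illustrative settings, the units with the large time constant should tend to change much less per step than the units with the small one, which is the property the statement associates with learning sequences versus movement primitives.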
“…In the previous study, they used only image data as sensory input, making it difficult to operate small objects that can be occluded during movement. Many papers have shown that using both vision and force can improve the accuracy of object recognition in both cognitive field and robotics field Fukui and Shimojo (1994), Ernst and Banks (2002), Liu et al (2017), and Saito et al (2021). By constructing the tool-use model with multimodal DNNs, we realize the robot to operate much more complex tools and objects than in the past study.…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years, techniques to control robots using deep learning have been developed to increase the accuracy of recognition [4,5]. Some robot control systems have also been developed to process multimodal information to obtain higher recognition [6][7][8][9][10]. Systems that process multimodal information face two challenges [11].…”
Section: Introduction (mentioning)
confidence: 99%