2018
DOI: 10.1177/0278364918760992

Natural language instructions for human–robot collaborative manipulation

Abstract: This paper presents a dataset of natural language instructions for object reference in manipulation scenarios. It comprises 1582 individual written instructions, which were collected via online crowdsourcing. This dataset is particularly useful for researchers who work in natural language processing, human-robot interaction, and robotic manipulation. In addition to serving as a rich corpus of domain-specific language, it provides a benchmark of image-instruction pairs to be used in system evaluations and uncov…

Cited by 26 publications (15 citation statements)
References 23 publications
“…In the field of human-robot interaction (HRI), computer vision [28,29] and speech recognition [30][31][32] are most commonly used as the primary modes of sensing and interaction. However, there has been a recognition of human touch and tactile HRI as an important mode of physical interaction [33].…”
Section: Overview of Tactile Sensing
confidence: 99%
“…Nyga and Beetz [22] curated datasets that provide sequences of how-to instructions for tasks such as preparing recipes. Others, such as Jain et al [23], Scalise et al [24], and Mandlekar et al [25], present simulation environments and datasets for tasks such as learning spatial affordances, situated interaction, or learning low-level motor skills. These existing datasets have two limitations that make them less usable for the learning task addressed in this work.…”
Section: Related Work
confidence: 99%
“…The task of visually grounding referring expressions has also been related to object manipulation [12], [13], enabling robots to find the target object to manipulate. [12] proposed a dataset that can be used to train a robot or agent to pick up the target object referred to by human language instructions. [13] developed a system that can distinguish the target object referred to by a human language instruction when multiple objects belonging to the same class are given.…”
Section: B. Visually Grounding Referring Expressions
confidence: 99%