A robot-assisted feeding system can potentially help a user with upper-body mobility impairments eat independently. However, autonomous assistance in the real world is challenging because of varying user preferences, impairment constraints, and the possibility of errors in uncertain and unstructured environments. An autonomous robot-assisted feeding system needs to decide on an appropriate strategy for acquiring a bite of hard-to-model deformable food items, the right time to bring the bite close to the mouth, and an appropriate strategy for transferring the bite easily. Our key insight is that such a system should be designed around a user's preferences regarding these challenging aspects of the task. In this work, we explore user preferences for different modes of autonomy given perceived error risks, and we analyze the effect of input modalities on technology acceptance. We found that more autonomy is not always better: participants did not prefer a robot with partial autonomy over one with low autonomy. In addition, participants' user interface preference shifted from voice control during individual dining to web-based control during social dining. Finally, we found differences in average ratings when grouping participants by their mobility limitations (lower vs. higher), suggesting that ratings from participants with lower mobility limitations correlate with higher expectations of robot performance.
CCS CONCEPTS: • Human-centered computing → Empirical studies in accessibility; • Social and professional topics → People with disabilities; Assistive technologies; • Computer systems organization → Robotic autonomy.
This paper presents a dataset of natural language instructions for object reference in manipulation scenarios. It comprises 1582 individual written instructions collected via online crowdsourcing. The dataset is particularly useful for researchers working in natural language processing, human-robot interaction, and robotic manipulation. In addition to serving as a rich corpus of domain-specific language, it provides a benchmark of image-instruction pairs for use in system evaluations and uncovers inherent challenges in tabletop object specification. Example Python code is provided for easy access to the data.
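As a minimal sketch of how such image-instruction pairs might be iterated over in Python, the snippet below loads a hypothetical JSON index and its associated images. The directory layout, file names, and field names (`tabletop_instructions`, `instructions.json`, `image_id`, `instruction`) are assumptions for illustration; the released dataset may use a different format.

```python
# Sketch: iterate over image-instruction pairs from a hypothetical dataset layout.
import json
from pathlib import Path

from PIL import Image  # Pillow, for loading scene images

DATA_DIR = Path("tabletop_instructions")  # assumed dataset root

with open(DATA_DIR / "instructions.json") as f:
    records = json.load(f)  # assumed: a list of dicts, one per written instruction

for record in records:
    image = Image.open(DATA_DIR / "images" / f"{record['image_id']}.png")
    instruction = record["instruction"]  # e.g. "pick up the mug on the left"
    # Each (image, instruction) pair can serve as an evaluation example
    # for a language-grounding or manipulation system.
    print(record["image_id"], instruction)
```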
We use static object data to improve success detection for stacking objects on and nesting objects in one another. Such actions are necessary for certain robotics tasks, e.g., clearing a dining table or packing a warehouse bin. However, using an RGB-D camera alone to detect success can be insufficient: same-colored objects can be difficult to differentiate, and reflective silverware causes noisy depth perception. We show that adding static data about the objects themselves improves the performance of an end-to-end pipeline for classifying action outcomes. Images of the objects, and language expressions describing them, encode prior geometry, shape, and size information that refines classification accuracy. We collect over 13 hours of egocentric manipulation data to train a model that reasons about whether a robot successfully placed unseen objects in or on one another. The model achieves up to a 57% absolute gain over the task baseline on pairs of previously unseen objects.
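To make the fusion idea concrete, here is an illustrative sketch (not the authors' released model) of a classification head that combines an egocentric action-clip embedding with static per-object embeddings derived from images and language descriptions. All layer sizes, names, and the two-object setup are assumptions for illustration.

```python
# Sketch: fuse a manipulation-clip feature with static object features
# to classify whether a place-in/place-on action succeeded.
import torch
import torch.nn as nn

class OutcomeClassifier(nn.Module):
    def __init__(self, clip_dim=512, obj_dim=256, hidden=256):
        super().__init__()
        # Concatenate the action-clip feature with static features
        # for the placed object and the receiving object, then classify.
        self.fusion = nn.Sequential(
            nn.Linear(clip_dim + 2 * obj_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),  # logits: [failure, success]
        )

    def forward(self, clip_feat, placed_obj_feat, receiving_obj_feat):
        x = torch.cat([clip_feat, placed_obj_feat, receiving_obj_feat], dim=-1)
        return self.fusion(x)

# Example usage with random placeholder features.
model = OutcomeClassifier()
clip_feat = torch.randn(4, 512)       # e.g. output of a video encoder
placed_obj = torch.randn(4, 256)      # static image+language embedding of the placed object
receiving_obj = torch.randn(4, 256)   # static embedding of the receiving object
logits = model(clip_feat, placed_obj, receiving_obj)
print(logits.shape)  # torch.Size([4, 2])
```

The design choice this illustrates is simply that static priors about the objects (shape, size, reflectivity cues captured in their images and descriptions) enter the classifier as additional input features alongside the observed manipulation clip.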