Optimization Beyond the Convolution: Generalizing Spatial Relations with End-to-End Metric Learning

Jund, Philipp; Eitel, Andreas; Abdo, Nichola; Burgard, Wolfram

doi:10.1109/icra.2018.8460220

Cited by 18 publications

(14 citation statements)

References 23 publications


“…MSR-based systems often learn the grounding of spatial relations offline or in a separate training phase. Specific instances of QSR and MSR have been used for different tasks in robotics and computer vision, e.g., QSR relations have been extracted from videos [19], MSR and kd-trees have been used to infer spatial relations between objects [67], QSR and MSR have been compared for scene understanding on robots [58], the relative position of objects has been used to predict successful action execution [16], and methods have been developed to reason about and learn spatial relations between objects [26,28]. Specialized meetings have explored the use of natural language to describe spatial relationships between objects [11,59].…”

Section: Related Work (mentioning)

confidence: 99%

Combining Commonsense Reasoning and Knowledge Acquisition to Guide Deep Learning in Robotics

Sridharan, Mota. 2022. Preprint.


Algorithms based on deep network models are being used for many pattern recognition and decision-making tasks in robotics and AI. Training these models requires a large labeled dataset and considerable computational resources, which are not readily available in many domains. Also, it is difficult to explore the internal representations and reasoning mechanisms of these models. As a step towards addressing the underlying knowledge representation, reasoning, and learning challenges, the architecture described in this paper draws inspiration from research in cognitive systems. As a motivating example, we consider an assistive robot trying to reduce clutter in any given scene by reasoning about the occlusion of objects and stability of object configurations in an image of the scene. In this context, our architecture incrementally learns and revises a grounding of the spatial relations between objects and uses this grounding to extract spatial information from input images. Non-monotonic logical reasoning with this information and incomplete commonsense domain knowledge is used to make decisions about stability and occlusion. For images that cannot be processed by such reasoning, regions relevant to the tasks at hand are automatically identified and used to train deep network models to make the desired decisions. Image regions used to train the deep networks are also used to incrementally acquire previously unknown state constraints that are merged with the existing knowledge for subsequent reasoning. Experimental evaluation performed using simulated and real-world images indicates that in comparison with baselines based just on deep networks, our architecture improves reliability of decision making and reduces the effort involved in training data-driven deep network models.



“…With regards to adaption policies, we do not yet model the spatial relations amongst the actors of interest; namely, the robot (end-effector), active objects (like objects to be gripped and the packaging box), and the world (support surfaces like tables and floor). These relationships provide important context for decision making and are recently attracting more attention [51][52][53][54]. Without spatial relation understanding, the solutions learned in Exp.…”

Section: Limitations, Comparisons and Future Work (mentioning)

confidence: 99%

Endowing Robots with Longer-term Autonomy by Recovering from External Disturbances in Manipulation Through Grounded Anomaly Classification and Recovery Policies

Luo, Duan, et al. 2021. J Intell Robot Syst.


Robots are poised to interact with humans in unstructured environments. Despite increasingly robust control algorithms, failure modes arise whenever the underlying dynamics are poorly modeled, especially in unstructured environments. We contribute a set of recovery policies to deal with anomalies produced by external disturbances. The recoveries work when various different types of anomalies are triggered any number of times at any point in the task, including during already running recoveries. Our recovery critic stands atop of a tightly-integrated, graph-based online motion-generation and introspection system. Policies, skills, and introspection models are learned incrementally and contextually over time. Recoveries are studied via a collaborative kitting task where a wide range of anomalous conditions are experienced in the system. We also contribute an extensive analysis of the performance of the tightly integrated anomaly identification, classification, and recovery system under extreme anomalous conditions. We show how the integration of such a system achieves performances greater than the sum of its parts.


“…Learning spatial relations by relying on the geometries of objects provides a robot with the necessary capability to carry out tasks that require understanding object interactions, such as object placing [1], [2], human robot interaction [15]-[18], object manipulation [5] or generalizing spatial relations to new objects [3], [4], [19]. Commonly, spatial relations are modeled based on the geometries of objects given their point cloud models [3]-[5].…”

Section: Related Work (mentioning)

confidence: 99%

Learning Object Placements For Relational Instructions by Hallucinating Scene Representations

Mees, Emek, Vertens, et al. 2020. Preprint.

Self Cite


Robots coexisting with humans in their environment and performing services for them need the ability to interact with them. One particular requirement for such robots is that they are able to understand spatial relations and can place objects in accordance with the spatial relations expressed by their user. In this work, we present a convolutional neural network for estimating pixelwise object placement probabilities for a set of spatial relations from a single input image. During training, our network receives the learning signal by classifying hallucinated high-level scene representations as an auxiliary task. Unlike previous approaches, our method does not require ground truth data for the pixelwise relational probabilities or 3D models of the objects, which significantly expands the applicability in practical applications. Our results obtained using real-world data and human-robot experiments demonstrate the effectiveness of our method in reasoning about the best way to place objects to reproduce a spatial relation. Videos of our experiments can be found at https://youtu.be/zaZkHTWFMKM

