SUM: Sequential scene understanding and manipulation

Sui, Zhiqiang; Zhou, Zebing; Zeng, Zhen; Jenkins, Odest Chadwicke

doi:10.1109/iros.2017.8206164

Cited by 36 publications

(27 citation statements)

References 36 publications

“…However, some limitations occur regularly, including framing the segmentation problem in 2D or 2.5D space [13] (i.e. not estimating full volumetric occupancy), relying on pre-specified [14] or simple geometric models [15] in the scene, or restricting the belief about the scene to a unimodal representation, even if tracking is performed in a multimodal fashion [16]. Working in 3D, as opposed to 2D or 2.5D, is particularly important, as it allows us to construct and retain object geometry estimates in the presence of occlusion, which is frequent in cluttered scenes.…”

Section: Related Work (mentioning)

confidence: 99%

“…Related work in sensor fusion via particle filter, which has been commonly used for non-linear state estimation, has seen various improvements on sampling sufficient valid states and avoiding degeneracy of the proposal distribution [33], [34]. Although these probabilistic methods have been used in manipulation [13], [14], they either only produce 2D estimates, or require prior knowledge of object models.…”

Section: Related Work (mentioning)

confidence: 99%

Fusing RGBD Tracking and Segmentation Tree Sampling for Multi-Hypothesis Volumetric Segmentation

Price, Huang, Berenson

2021

Preprint

Despite rapid progress in scene segmentation in recent years, 3D segmentation methods are still limited when there is severe occlusion. The key challenge is estimating the segment boundaries of (partially) occluded objects, which are inherently ambiguous when considering only a single frame. In this work, we propose Multihypothesis Segmentation Tracking (MST), a novel method for volumetric segmentation in changing scenes, which allows scene ambiguity to be tracked and our estimates to be adjusted over time as we interact with the scene. Two main innovations allow us to tackle this difficult problem: 1) A novel way to sample possible segmentations from a segmentation tree; and 2) A novel approach to fusing tracking results with multiple segmentation estimates. These methods allow MST to track the segmentation state over time and incorporate new information, such as new objects being revealed. We evaluate our method on several cluttered tabletop environments in simulation and reality. Our results show that MST outperforms baselines in all tested scenes.

“…Discriminative-generative algorithms [25], [24], [12] offer a promising avenue for robust perception and action. Such methods combine inference by deep learning with sampling and probabilistic inference models, and the ability to represent actual and counterfactual experiments to achieve robust and adaptive understanding.…”

Section: Introduction (mentioning)

confidence: 99%

Hardware acceleration of robot scene perception algorithms

Liu, Derman, Calderoni, et al. 2020

Proceedings of the 39th International Conference on Computer-Aided Design

Hybrid machine learning algorithms that combine deep learning with probabilistic inference techniques provide highly accurate scene perception for robot manipulation. In particular, a 2-stage approach that combines object detection using convolutional neural networks with Monte-Carlo sampling for pose estimation has been shown to perform particularly well under adversarial scenarios. Unfortunately, this accuracy comes at the cost of high computational complexity, which affects runtime, resource utilization, and energy consumption. This paper describes various challenges in developing complexity-aware techniques for robust robot perception and presents a novel hardware accelerator that addresses these challenges. Experimental results show our design is at least 30% faster and consumes 97% less energy compared to an implementation on a high-end GPU. Compared to a low-power GPU implementation, our design is 95% faster while consuming 96% less energy, demonstrating that accurate, energy-efficient scene perception is possible in real time with targeted hardware acceleration.

CCS Concepts: Hardware → Hardware accelerators; Computing methodologies → Rasterization; Computer systems organization → Real-time system architecture.

“…Generative-discriminative algorithms [7], [8] offer a promising avenue for robust perception. Such methods combine inference by deep learning (or other discriminative techniques) with sampling and probabilistic inference models to achieve robust and adaptive perception in adversarial environments.…”

Section: Introduction (mentioning)

confidence: 99%

GRIP: Generative Robust Inference and Perception for Semantic Robot Manipulation in Adversarial Environments

Chen, Sui, et al. 2019

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Self Cite

Recent advancements have led to a proliferation of machine learning systems used to assist humans in a wide range of tasks. However, we are still far from accurate, reliable, and resource-efficient operations of these systems. For robot perception, convolutional neural networks (CNNs) for object detection and pose estimation are recently coming into widespread use. However, neural networks are known to suffer from overfitting during the training process and are less robust under unforeseen conditions (which makes them especially vulnerable to adversarial scenarios). In this work, we propose Generative Robust Inference and Perception (GRIP) as a two-stage object detection and pose estimation system that aims to combine the relative strengths of discriminative CNNs and generative inference methods to achieve robust estimation. Our results show that a second stage of sample-based generative inference is able to recover from false object detections by CNNs, and produce robust estimations in adversarial conditions. We demonstrate the efficacy of GRIP robustness through comparison with state-of-the-art learning-based pose estimators and pick-and-place manipulation in dark and cluttered environments.
