The task of searching for and grasping objects in cluttered scenes, typical of robotic applications in domestic environments, requires fast object detection and segmentation. Attentional mechanisms provide a means to detect and prioritize the processing of objects of interest. In this work, we combine a saliency operator based on symmetry with a segmentation method based on clustering locally planar surface patches, both operating on 2.5D point clouds (RGB-D images), to yield a novel approach to table-top scene segmentation. Evaluation on indoor table-top scenes containing man-made objects clustered in piles and dumped in a box shows that our approach to selecting attention points significantly improves the performance of state-of-the-art attention-based segmentation methods.
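The idea of clustering locally planar surface patches can be illustrated with a minimal sketch: points whose surface normals agree within an angular tolerance are grouped into the same patch. This is a toy simplification (it ignores spatial connectivity and the paper's actual clustering procedure); the function name, tolerance, and inputs are assumptions for illustration.

```python
import numpy as np

def segment_planar_points(normals, angle_tol_deg=10.0):
    """Toy sketch: group points into planar patches by greedily
    assigning all points whose (unit) normals lie within an angular
    tolerance of a seed normal to one patch. Illustrative only."""
    cos_tol = np.cos(np.radians(angle_tol_deg))
    labels = -np.ones(len(normals), dtype=int)  # -1 = unlabeled
    next_label = 0
    for i in range(len(normals)):
        if labels[i] >= 0:
            continue  # already assigned to a patch
        # Cosine similarity between the seed normal and all normals
        sim = normals @ normals[i]
        mask = (labels < 0) & (sim >= cos_tol)
        labels[mask] = next_label
        next_label += 1
    return labels

# Example: two up-facing points and one side-facing point
normals = np.array([[0.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0]])
print(segment_planar_points(normals))  # → [0 0 1]
```

In a full pipeline, the normals would be estimated from the RGB-D depth channel and connectivity constraints would keep coplanar but distant surfaces apart.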
We present a novel method based on saliency and segmentation to generate generic object candidates from RGB-D data. Our method uses saliency as a cue to roughly estimate the location and extent of the objects present in the scene. Salient regions are used to glue together the segments obtained from over-segmenting the scene by either color or depth segmentation algorithms, or by a combination of both. We suggest a late-fusion approach that first extracts segments from color and depth independently before fusing them, exploiting the complementary nature of the two modalities. Furthermore, we investigate several mechanisms for ranking the object candidates. We evaluate on one publicly available dataset and on one challenging sequence with a high degree of clutter. The results show that we are able to retrieve most objects in real-world indoor scenes and clearly outperform other state-of-the-art methods.
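The late-fusion step and the saliency-based gluing described above can be sketched as follows. This is a minimal illustration under assumed inputs (two per-pixel label maps and a saliency map in [0, 1]); the function names and the 0.5 threshold are not from the paper.

```python
import numpy as np

def fuse_segments(color_seg, depth_seg):
    """Late fusion sketch: intersect two over-segmentations so each
    fused segment is homogeneous in both color and depth labels."""
    # Encode each (color label, depth label) pair as a unique integer
    pairs = color_seg.astype(np.int64) * (depth_seg.max() + 1) + depth_seg
    _, fused = np.unique(pairs, return_inverse=True)
    return fused.reshape(color_seg.shape)

def candidate_from_saliency(segments, saliency, thresh=0.5):
    """Glue together all segments overlapping a salient region to
    form one object-candidate mask (toy sketch)."""
    salient_labels = np.unique(segments[saliency > thresh])
    return np.isin(segments, salient_labels)

# Example: a 2x3 image over-segmented by color and by depth
color_seg = np.array([[0, 0, 1],
                      [0, 0, 1]])
depth_seg = np.array([[0, 1, 1],
                      [0, 1, 1]])
fused = fuse_segments(color_seg, depth_seg)   # 3 fused segments
saliency = np.array([[0.9, 0.1, 0.1],
                     [0.9, 0.1, 0.1]])
mask = candidate_from_saliency(fused, saliency)
```

Intersecting the two label maps is the simplest possible fusion rule; it only splits segments and never merges them, which is why the subsequent saliency-driven gluing step is needed to recover whole objects.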
3D visual attention plays an important role in both human and robot perception that has yet to be explored in full detail. However, the majority of computer vision and robotics methods are concerned only with 2D visual attention. This survey presents findings and approaches that cover 3D visual attention in both human and robot vision, summarizing the last 30 years of research and also looking beyond computational methods. First, we present work in fields such as biological vision and neurophysiology that studies 3D attention in human observers. This provides a view of the role attention plays at the system level in biological vision. Then, we cover computer and robot vision approaches that take 3D visual attention into account. We compare approaches with respect to different categories, such as feature-based, data-based, or depth-based visual attention, and draw conclusions on which advances will help robots cope better with complex real-world settings and tasks.