This paper presents a bottom-up approach for the perceptual segmentation of natural images. The segmentation algorithm consists of two consecutive stages: first, the input image is partitioned into a set of blobs of uniform colour (pre-segmentation stage); then, these blobs are hierarchically merged using a more complex distance that integrates edge and region descriptors (perceptual grouping stage). Both stages are addressed using the Combinatorial Pyramid, a hierarchical structure that can correctly encode relationships among image regions at upper levels. Thus, unlike other methods, the topology of the image is preserved. The performance of the proposed approach has been initially evaluated against ground-truth segmentation data using the Berkeley Segmentation Dataset and Benchmark. Although additional descriptors must be added to deal with textured surfaces, experimental results reveal that the proposed perceptual grouping provides satisfactory scores.
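The two-stage idea can be illustrated with a small sketch. It is not the authors' method: it replaces the Combinatorial Pyramid with a plain union-find structure (so it does not preserve topology), works on grey levels instead of colour, and uses a made-up distance, a weighted sum of the difference between region means and the mean contrast along the shared boundary. The names `presegment`, `group`, `colour_thr`, `merge_thr` and `weight` are all hypothetical.

```python
from collections import defaultdict


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def neighbour_pairs(h, w):
    """Yield index pairs of 4-connected pixels in an h x w grid."""
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                yield i, i + 1
            if y + 1 < h:
                yield i, i + w


def presegment(img, colour_thr):
    """Stage 1 (assumed form): join neighbouring pixels with similar grey level."""
    h, w = len(img), len(img[0])
    flat = [v for row in img for v in row]
    parent = list(range(h * w))
    for a, b in neighbour_pairs(h, w):
        if abs(flat[a] - flat[b]) <= colour_thr:
            ra, rb = find(parent, a), find(parent, b)
            if ra != rb:
                parent[rb] = ra
    return flat, parent


def group(flat, parent, h, w, merge_thr, weight=0.5):
    """Stage 2 (assumed form): repeatedly merge the closest pair of adjacent blobs."""
    while True:
        # Region descriptor: mean grey level of each blob.
        total, count = defaultdict(float), defaultdict(int)
        for i, v in enumerate(flat):
            r = find(parent, i)
            total[r] += v
            count[r] += 1
        # Edge descriptor: mean contrast along the boundary shared by two blobs.
        contrast, length = defaultdict(float), defaultdict(int)
        for a, b in neighbour_pairs(h, w):
            ra, rb = find(parent, a), find(parent, b)
            if ra != rb:
                key = (min(ra, rb), max(ra, rb))
                contrast[key] += abs(flat[a] - flat[b])
                length[key] += 1
        best = None
        for (ra, rb), c in contrast.items():
            region_d = abs(total[ra] / count[ra] - total[rb] / count[rb])
            edge_d = c / length[(ra, rb)]
            d = weight * region_d + (1 - weight) * edge_d  # hypothetical distance
            if d <= merge_thr and (best is None or d < best[0]):
                best = (d, ra, rb)
        if best is None:
            return parent
        parent[best[2]] = best[1]


def count_regions(parent):
    """Number of distinct regions currently encoded by the union-find forest."""
    return len({find(parent, i) for i in range(len(parent))})


img = [[10, 11, 50, 52],
       [10, 12, 51, 90],
       [11, 12, 90, 91]]
flat, parent = presegment(img, colour_thr=2)
blobs = count_regions(parent)
group(flat, parent, 3, 4, merge_thr=39.2)
regions = count_regions(parent)
```

On this toy image the first stage yields one blob per grey-level plateau, and the second stage merges only the pair of adjacent blobs whose combined distance falls under the threshold. The thresholds are arbitrary values chosen for the example.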
In biological vision systems, attention mechanisms are responsible for selecting the relevant information from the sensed field of view, so that the complete scene can be analyzed using a sequence of rapid eye saccades. In recent years, efforts have been made to imitate such attention behavior in artificial vision systems, because it optimizes the use of computational resources by focusing them on the processing of a set of selected regions. In the framework of mobile robot navigation, this work proposes an artificial model where attention is deployed at the level of objects (visual landmarks) and where new processes for estimating bottom-up and top-down (target-based) saliency maps are employed. Bottom-up attention is implemented through a hierarchical process whose final result is the perceptual grouping of the image content. The hierarchical grouping is applied using a Combinatorial Pyramid that represents each level of the hierarchy by a combinatorial map. The process takes into account both image regions (faces in the map) and edges (arcs in the map). Top-down attention searches for previously detected landmarks, enabling their re-detection when the robot presumes that it is revisiting a known location. Landmarks are described by a combinatorial submap; thus, this search is conducted through an error-tolerant submap isomorphism procedure.
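To make the combinatorial map concrete, here is a minimal sketch of one common formulation, which may differ from the convention used in this work. Each edge is split into two darts, an involution `alpha` pairs the darts of an edge, and a permutation `sigma` gives the next dart around a vertex. Faces are then the orbits of `phi = sigma ∘ alpha`. The dart numbering and the triangle example are illustrative assumptions.

```python
def orbits(perm):
    """Return the cycles of a permutation given as a dict {dart: image}."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        orbit, d = [], start
        while d not in seen:
            seen.add(d)
            orbit.append(d)
            d = perm[d]
        result.append(orbit)
    return result


# A triangle: three edges, each split into darts d and -d.
darts = (1, -1, 2, -2, 3, -3)

# alpha pairs the two darts of the same edge.
alpha = {d: -d for d in darts}

# sigma maps each dart to the next dart around its origin vertex.
sigma = {1: -3, -3: 1,   # vertex shared by edges 1 and 3
         -1: 2, 2: -1,   # vertex shared by edges 1 and 2
         -2: 3, 3: -2}   # vertex shared by edges 2 and 3

# phi = sigma after alpha; its orbits are the faces (image regions).
phi = {d: sigma[alpha[d]] for d in darts}

vertices = orbits(sigma)
edges = orbits(alpha)
faces = orbits(phi)

# Euler characteristic of a connected planar map: V - E + F = 2.
euler = len(vertices) - len(edges) + len(faces)
```

The triangle has two faces, the inner region and the unbounded background. Because the map stores the order of darts around each vertex, it records how regions adjoin one another and not merely which ones do, which is the kind of topological information a plain region adjacency graph loses.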