Saliency-Aware Video Object Segmentation

Yang, Yi; Shen, Jianbing; Yang, Ruigang; Porikli, Fatih

doi:10.1109/tpami.2017.2662005

Cited by 467 publications

(163 citation statements)

References 65 publications

“…Superpixel segmentation can be used as pre‐processing step for many computer vision tasks. In this paper, we apply superpixel segmentation in saliency detection [WSS15, WSYP18]. A recent method for saliency optimization from robust background detection [ZLWS14] uses superpixel segmentation as input to generate salient object detection results.…”

Section: Methods (mentioning)

confidence: 99%

Superpixel Generation by Agglomerative Clustering With Quadratic Error Minimization

Dong, Chen, Yao, et al. 2018

Computer Graphics Forum

Superpixel segmentation is a popular image pre‐processing technique in many computer vision applications. In this paper, we present a novel superpixel generation algorithm by agglomerative clustering with quadratic error minimization. We use a quadratic error metric (QEM) to measure the difference of spatial compactness and colour homogeneity between superpixels. Based on the quadratic function, we propose a bottom‐up greedy clustering algorithm to obtain higher quality superpixel segmentation. There are two steps in our algorithm: merging and swapping. First, we calculate the merging cost of two superpixels and iteratively merge the pair with the minimum cost until the termination condition is satisfied. Then, we optimize the boundary of superpixels by swapping pixels according to their swapping cost to improve the compactness. Due to the quadratic nature of the energy function, each of these atomic operations has only O(1) time complexity. We compare the new method with other state‐of‐the‐art superpixel generation algorithms on two datasets, and our algorithm demonstrates superior performance.
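The merging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' QEM: it assumes a Ward-style cost (the increase in summed squared error) on scalar intensities along a single row of pixels, and it omits the swapping step. All names (`Cluster`, `merge_cost`, `greedy_merge`) are hypothetical. The point it shows is that a quadratic error lets each merge cost be evaluated in O(1) from three running sums.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    n: int      # number of pixels in the cluster
    s: float    # sum of pixel intensities
    ss: float   # sum of squared pixel intensities

    def sse(self) -> float:
        # Sum of squared deviations from the cluster mean,
        # computed from the running sums alone.
        return self.ss - self.s * self.s / self.n


def merge(a: Cluster, b: Cluster) -> Cluster:
    # Merging only adds the sufficient statistics: O(1).
    return Cluster(a.n + b.n, a.s + b.s, a.ss + b.ss)


def merge_cost(a: Cluster, b: Cluster) -> float:
    # Assumed cost: increase in total squared error caused by the merge.
    return merge(a, b).sse() - a.sse() - b.sse()


def greedy_merge(values, target):
    """Merge adjacent clusters on a 1D row until `target` clusters remain.

    Returns half-open index spans (start, end), one per cluster.
    """
    clusters = [Cluster(1, v, v * v) for v in values]
    spans = [(i, i + 1) for i in range(len(values))]
    while len(clusters) > target:
        # Pick the adjacent pair with the minimum merge cost.
        i = min(
            range(len(clusters) - 1),
            key=lambda k: merge_cost(clusters[k], clusters[k + 1]),
        )
        clusters[i:i + 2] = [merge(clusters[i], clusters[i + 1])]
        spans[i:i + 2] = [(spans[i][0], spans[i + 1][1])]
    return spans


if __name__ == "__main__":
    print(greedy_merge([10, 11, 12, 50, 52, 90], target=3))
```

For simplicity the sketch rescans all adjacent pairs on every iteration; a priority queue keyed on merge cost would avoid that rescan, and a 2D version would track a region adjacency graph instead of a row.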

“…3, we provide the features visualization of the proposed network. As visible, the features progressively become discriminative (close to the final saliency map) which can effectively distinguish the foreground and background, such as the features in CU(0,3) and CU (1,3). In addition, one can find that the detail features (e.g., edges and textures) in the encoder path become more and more abstract with the downsampling, while the cluttered and noisy backgrounds gradually vanish with the nested connections and up-sampling in the decoder path.…”

Section: A Framework (mentioning)

confidence: 99%

“…where F (0,3) and F (1,3) are the features extracted by the convolution units CU (0,3) and CU (1,3) , respectively.…”

Section: Encoder-decoder Module With Nested Connections (mentioning)

confidence: 99%

Nested Network With Two-Stream Pyramid for Salient Object Detection in Optical Remote Sensing Images

Cong, Hou, et al. 2019

IEEE Trans. Geosci. Remote Sensing

Arising from the various object types and scales, diverse imaging orientations, and cluttered backgrounds in optical remote sensing image (RSI), it is difficult to directly extend the success of salient object detection for nature scene image to the optical RSI. In this paper, we propose an end-to-end deep network called LV-Net based on the shape of network architecture, which detects salient objects from optical RSIs in a purely data-driven fashion. The proposed LV-Net consists of two key modules, i.e., a two-stream pyramid module (L-shaped module) and an encoder-decoder module with nested connections (V-shaped module). Specifically, the L-shaped module extracts a set of complementary information hierarchically by using a two-stream pyramid structure, which is beneficial to perceiving the diverse scales and local details of salient objects. The V-shaped module gradually integrates encoder detail features with decoder semantic features through nested connections, which aims at suppressing the cluttered backgrounds and highlighting the salient objects. In addition, we construct the first publicly available optical RSI dataset for salient object detection, including 800 images with varying spatial resolutions, diverse saliency types, and pixel-wise ground truth. Experiments on this benchmark dataset demonstrate that the proposed method outperforms the state-of-the-art salient object detection methods both qualitatively and quantitatively.

“…Then automatically bootstraps an appearance model based on the initial foreground estimate, and uses it to refine the spatial accuracy of the segmentation and to also segment the object in frames where it does not move. The works [44], [45], [46] extend the concept of salient objects detection [47] as prior knowledge to infer the objects. Semi-supervised video segmentation, which also refers to label propagation, is usually achieved via propagating human annotation specified on one or a few key-frames onto the entire video sequence [48], [49], [50].…”

Section: Moving Object Segmentation (mentioning)

confidence: 99%

Joint Stereo Video Deblurring, Scene Flow Estimation and Moving Object Segmentation

Pan, Dai, Liu, et al. 2020

IEEE Trans. on Image Process.

Self Cite

Stereo videos for the dynamic scenes often show unpleasant blurred effects due to the camera motion and the multiple moving objects with large depth variations. Given consecutive blurred stereo video frames, we aim to recover the latent clean images, estimate the 3D scene flow and segment the multiple moving objects. These three tasks have been previously addressed separately, which fail to exploit the internal connections among these tasks and cannot achieve optimality. In this paper, we propose to jointly solve these three tasks in a unified framework by exploiting their intrinsic connections. To this end, we represent the dynamic scenes with the piece-wise planar model, which exploits the local structure of the scene and expresses various dynamic scenes. Under our model, these three tasks are naturally connected and expressed as the parameter estimation of 3D scene structure and camera motion (structure and motion for the dynamic scenes). By exploiting the blur model constraint, the moving objects and the 3D scene structure, we reach an energy minimization formulation for joint deblurring, scene flow and segmentation. We evaluate our approach extensively on both synthetic datasets and publicly available real datasets with fast-moving objects, camera motion, uncontrolled lighting conditions and shadows. Experimental results demonstrate that our method can achieve significant improvement in stereo video deblurring, scene flow estimation and moving object segmentation, over state-of-the-art methods.
