Salient object location and segmentation are two different tasks in salient object detection (SOD). The former aims to find the most attractive objects in an image globally, whereas the latter can be achieved using only the local regions that contain salient objects. However, previous methods mostly perform the two tasks simultaneously in a simple end-to-end manner, ignoring the differences between them. Assuming that the human vision system first locates objects and then segments them, we propose a novel progressive architecture with knowledge review network (PA-KRN) for SOD. It consists of three parts: (1) a coarse locating module (CLM), supervised by body-attention labels, that locates rough areas containing salient objects without boundary details; (2) an attention-based sampler that highlights salient object regions at high resolution based on the body-attention maps; and (3) a fine segmenting module (FSM) that finely segments the salient objects. The networks in the CLM and FSM are based on our proposed knowledge review network (KRN), which uses the finest feature maps to reintegrate all previous layers, compensating for important information that is progressively diluted along the top-down path. Experiments on five benchmarks demonstrate that a single KRN already outperforms state-of-the-art methods, and the full PA-KRN surpasses them by a substantial margin.
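The knowledge-review idea lends itself to a short sketch: the finest decoder feature is fused back into every earlier stage to restore information diluted along the top-down path. The minimal PyTorch illustration below is an assumption-laden sketch, not the published KRN; the module name, channel widths, and fusion operator are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeReview(nn.Module):
    """Sketch of the knowledge-review step: re-fuse the finest feature
    map into all earlier stages. Layer sizes are illustrative only."""
    def __init__(self, channels):
        super().__init__()
        # One fusion conv per stage: (stage channels + finest channels) -> stage channels
        self.fuse = nn.ModuleList(
            nn.Conv2d(c + channels[-1], c, kernel_size=3, padding=1)
            for c in channels
        )

    def forward(self, feats):
        finest = feats[-1]  # the most refined feature map
        out = []
        for f, conv in zip(feats, self.fuse):
            # Resize the finest feature to each stage's resolution and fuse.
            g = F.interpolate(finest, size=f.shape[-2:], mode="bilinear",
                              align_corners=False)
            out.append(F.relu(conv(torch.cat([f, g], dim=1))))
        return out

# Usage sketch with dummy multi-scale features.
feats = [torch.randn(1, c, s, s) for c, s in [(64, 64), (128, 32), (256, 16)]]
refined = KnowledgeReview([64, 128, 256])(feats)
```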
Scribble labels have gained increasing attention in weakly supervised video salient object detection (VSOD). Building on scribble labels, recent methods propagate labeled pixels to unlabeled regions with a local coherence loss, but the predicted objects often lose detail and boundary information. In this work, a novel method based on back-foreground weight contrast is proposed that adds label enhancement points to help the model learn the edges, details, and locations of salient objects. Additionally, a new VSOD framework based on global structural localization is introduced: the enhanced scribble labels assist the model in global localization, and the located regions are then finely segmented by the trained model. Extensive experiments demonstrate that the method achieves state-of-the-art performance on common VSOD datasets, with improvements of 3.75%, 4.68%, and 0.88% in S-measure, F-measure, and MAE, respectively.
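Scribble supervision of this kind is typically built on a partial loss that penalizes only annotated pixels while leaving the rest of the image unsupervised. The sketch below shows that generic ingredient in PyTorch; it is not the paper's back-foreground weight contrast formulation, and the `ignore_value` convention for unlabeled pixels is an assumption.

```python
import torch
import torch.nn.functional as F

def partial_bce(pred, scribble, ignore_value=255):
    """Binary cross-entropy computed only on scribble-annotated pixels.

    pred:     (B, 1, H, W) raw logits
    scribble: (B, 1, H, W) with 1 = foreground scribble, 0 = background
              scribble, ignore_value = unlabeled pixel
    Generic scribble-supervision loss, not the paper's exact objective.
    """
    mask = scribble != ignore_value
    if mask.sum() == 0:
        return pred.sum() * 0.0  # no labeled pixels in this batch
    target = scribble.float().clamp(0, 1)  # ignored pixels masked out below
    loss = F.binary_cross_entropy_with_logits(pred, target, reduction="none")
    return loss[mask].mean()
```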
Many image-to-image computer vision tasks have made great progress with end-to-end encoder-decoder architectures. Eye fixation prediction, however, although also an image-to-image task, differs from those tasks in that it focuses on salient regions rather than precise predictions for every pixel, so directly applying an end-to-end encoder-decoder to it is not appropriate. In addition, although high-level features are important, the contribution of low-level features should also be retained and balanced in a computational model; yet low-level features that attract attention are easily lost while passing through a deep network. Effectively integrating low-level and high-level features to improve eye fixation prediction therefore remains a challenging task. In this paper, a coarse-to-fine network (CFN) comprising two pathways with different training strategies is proposed: the coarse perceiving network (CFN-Coarse), which can be a simple encoder or any existing pretrained network, captures the distribution of salient regions and generates high-quality feature maps; the fine integrating network (CFN-Fine) fixes the parameters of CFN-Coarse and combines features from deep to shallow in the deconvolution process, adding skip connections between the down-sampling and up-sampling paths to efficiently integrate deep and shallow features. The resulting saliency maps are evaluated on six standard benchmark datasets: SALICON, MIT1003, MIT300, Toronto, OSIE, and SUN500. The results demonstrate that the method surpasses the state-of-the-art accuracy of eye fixation prediction and achieves competitive performance under most evaluation metrics on the SALICON Saliency Prediction Challenge (LSUN 2017).
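The two-pathway design can be sketched as a frozen coarse encoder feeding a decoder that fuses deep-to-shallow features through skip connections on the up-sampling path. The PyTorch sketch below is schematic: `TinyEncoder` is a hypothetical stand-in for CFN-Coarse (any pretrained backbone returning multi-scale features), and the channel widths are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Hypothetical stand-in for CFN-Coarse: returns multi-scale features."""
    def __init__(self):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Conv2d(3, 64, 3, stride=2, padding=1),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.Conv2d(128, 256, 3, stride=2, padding=1),
        ])

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = F.relu(stage(x))
            feats.append(x)
        return feats

class CFNFine(nn.Module):
    """Sketch of the fine pathway: coarse parameters are fixed, and
    deep features are combined with shallow ones via skip connections."""
    def __init__(self, encoder, channels=(64, 128, 256)):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # the coarse pathway stays frozen
        self.up = nn.ModuleList(
            nn.Conv2d(channels[i] + channels[i - 1], channels[i - 1], 3, padding=1)
            for i in range(len(channels) - 1, 0, -1)
        )
        self.head = nn.Conv2d(channels[0], 1, 1)

    def forward(self, x):
        feats = self.encoder(x)
        y = feats[-1]
        for conv, skip in zip(self.up, reversed(feats[:-1])):
            # Up-sample the deep feature and fuse it with the shallow one.
            y = F.interpolate(y, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
            y = F.relu(conv(torch.cat([y, skip], dim=1)))  # skip connection
        return torch.sigmoid(self.head(y))  # predicted fixation map

# Usage sketch.
model = CFNFine(TinyEncoder())
saliency = model(torch.randn(1, 3, 128, 128))
```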