Saliency Unified: A Deep Architecture for simultaneous Eye Fixation Prediction and Salient Object Segmentation

Kruthiventi, Srinivas S. S.; Gudisa, Vennela; Dholakiya, Jaley H.; Babu, R. Venkatesh

doi:10.1109/cvpr.2016.623

Cited by 130 publications

(65 citation statements)

References 32 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The parameters corresponding to the universal saliency map channel and 1 × 1 conv layers for middle layer supervision are initialized with 'xavier'. Following the same initialization step in [10] and [13], we use the welltrained DeepNet model to initialize the corresponding parameters in our network. The network architecture of our Multi-task CNN is identical to that of DeepNet [10] except that: i) the parameters corresponding to tasks of different observers are different; ii) middle layer supervision is imposed by adding 1 × 1 conv layer after conv5 and conv6, respectively; iii) an additional channel corresponding to USM is added in the input.…”

Section: Methodsmentioning

confidence: 99%

“…Our personalized saliency detection exploits convolutional neural network (CNN) in light of its great success in multiple computer vision tasks, e.g., image classification [40], semantic segmentation [41], as well as saliency detection [42] [43] [12] [13]. An early approach of Ensembles of Deep Networks (eDN) [42] was proposed by Vig et al, where feature maps from different layers in a 3layers ConvNet are fed into a simple linear classifier for salient or non-salient classification.…”

Section: Cnn Based Saliency Detectionmentioning

confidence: 99%

See 1 more Smart Citation

Personalized Saliency and Its Prediction

Gao

et al. 2019

IEEE Trans. Pattern Anal. Mach. Intell.

View full text Add to dashboard Cite

Almost all existing visual saliency models focus on predicting a universal saliency map across all observers. Yet psychology studies suggest that visual attention of different observers can vary a lot under some circumstances. In this paper, we set out to study this heterogenous visual attention pattern between different observers and build the first dataset for personalized saliency detection. Further, we propose to decompose a personalized saliency map (PSM) into a universal saliency map (USM) which can be predicted by any existing saliency detection models and a discrepancy between them. Then personalized saliency detection is casted as the task of discrepancy estimation between PSM and USM. To tackle this task we propose two solutions: i) The discrepancy estimation for different observers are casted as different but related tasks. Then we feed the image and its USM into a multi-task convolutional neural network framework to estimate the discrepancy between PSM and USM for each observer; ii) As the discrepancy is related to both image contents and the observers' person-specific information, we feed the image and its associated USM into a convolutional neural network with person-specific information encoded filters to estimate the discrepancy. Extensive experimental results demonstrate the effectiveness of our models for PSM prediction as well their generalization capability for unseen observers.

show abstract

Section: Methodsmentioning

confidence: 99%

Section: Cnn Based Saliency Detectionmentioning

confidence: 99%

Personalized Saliency and Its Prediction

Gao

et al. 2019

IEEE Trans. Pattern Anal. Mach. Intell.

View full text Add to dashboard Cite

show abstract

“…Liu et al [32] and Achanta et al [33] presented salient region detection as an outgrowth of the binary object segmentation problem. Traditional salient object detection depends primarily on traditional machine learning methods such as bottom-up and top-down methods based on multilevel features [34] [38]. Li et al introduced semantic features into a multitask convolutional network to assist in salient object detection [39].…”

Section: A Salient Object Detectionmentioning

confidence: 99%

Salient instance segmentation via subitizing and clustering

et al. 2020

View full text Add to dashboard Cite

The goal of salient region detection is to identify the regions of an image that attract the most attention. Many methods have achieved state-of-the-art performance levels on this task. Recently, salient instance segmentation has become an even more challenging task than traditional salient region detection; however, few of the existing methods have concentrated on this underexplored problem. Unlike the existing methods, which usually employ object proposals to roughly count and locate object instances, our method applies salient objects subitizing to predict an accurate number of instances for salient instance segmentation. In this paper, we propose a multitask densely connected neural network (MDNN) to segment salient instances in an image. In contrast to existing approaches, our framework is proposal-free and category-independent. The MDNN contains two parallel branches: the first is a densely connected subitizing network (DSN) used for subitizing prediction; the second is a densely connected fully convolutional network (DFCN) used for salient region detection. The MDNN simultaneously outputs saliency maps and salient object subitizing. Then, an adaptive deep feature-based spectral clustering operation segments the salient regions into instances based on the subitizing and saliency maps. The experimental results on both salient region detection and salient instance segmentation datasets demonstrate the satisfactory performance of our framework. Notably, its AP r @0.7 reaches 60.14% in the salient instance dataset, surpasses the state-of-the-art methods by about 5%.

show abstract

“…He et al [45] adopted CNNs to characterize superpixels with hierarchical features so as to detect salient objects at multiple scales, while the superpixel-based saliency computation was used by [25], [46] as well. Considering that the task of fixation prediction is tightly correlated with SOD, a unified deep network was proposed in [47] for simultaneous fixation prediction and image-based SOD.…”

Section: B Modelsmentioning

confidence: 99%

A Benchmark Dataset and Saliency-Guided Stacked Autoencoders for Video-Based Salient Object Detection

Xia

Chen

2018

IEEE Trans. on Image Process.

131

View full text Add to dashboard Cite

Abstract-Image-based salient object detection (SOD) has been extensively studied in the past decades. However, videobased SOD is much less explored since there lack large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos (64 minutes). In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects that freeview all videos. From the user data, we find salient objects in video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection.Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliencyguided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are unsupervisedly constructed which automatically infer a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. Experimental results show that the proposed unsupervised approach outperforms 30 state-of-the-art models on the proposed dataset, including 19 image-based & classic (unsupervised or non-deep learning), 6 image-based & deep learning, and 5 video-based & unsupervised. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.

show abstract

Saliency Unified: A Deep Architecture for simultaneous Eye Fixation Prediction and Salient Object Segmentation

Cited by 130 publications

References 32 publications

Personalized Saliency and Its Prediction

Personalized Saliency and Its Prediction

Salient instance segmentation via subitizing and clustering

A Benchmark Dataset and Saliency-Guided Stacked Autoencoders for Video-Based Salient Object Detection

Contact Info

Product

Resources

About