Semantic Correlation Promoted Shape-Variant Context for Segmentation

Ding, Henghui; Jiang, Xudong; Shuai, Bing; Liu, Ai Qun; Wang, Gang

doi:10.1109/cvpr.2019.00909

Cited by 173 publications

(99 citation statements)

References 67 publications

“…Therefore, the difference between deep layer and shallow layer in the use of context information leads to the variation of classification capacities. On the other hand, the spatial information of low level features is important to localize the classified objects, but these low level features also bring debatable noisy information that results in categorical errors [68]. In this paper, we rethink the relationship between shallow and corresponding deep layers in the skip connection at the feature level.…”

Section: Discussion (mentioning)

confidence: 99%

Building Extraction from Very High Resolution Aerial Imagery Using Joint Attention Deep Neural Network

Gan

et al. 2019

Remote Sensing

Automated methods to extract buildings from very high resolution (VHR) remote sensing data have many applications in a wide range of fields. Many convolutional neural network (CNN) based methods have been proposed and have achieved significant advances in the building extraction task. In order to refine predictions, many recent approaches fuse features from earlier layers of CNNs to introduce abundant spatial information, which is known as skip connection. However, this strategy of reusing earlier features directly without processing could reduce the performance of the network. To address this problem, we propose a novel fully convolutional network (FCN) that adopts attention based re-weighting to extract buildings from aerial imagery. Specifically, we consider the semantic gap between features from different stages and leverage the attention mechanism to bridge the gap prior to the fusion of features. The inferred attention weights along spatial and channel-wise dimensions make the low level feature maps adaptive to high level feature maps in a target-oriented manner. Experimental results on three publicly available aerial imagery datasets show that the proposed model (RFA-UNet) achieves comparable and improved performance compared to other state-of-the-art models for building extraction.

…to detect large objects [14,15]. Since Long et al. [16] adapted the classification network into a fully convolutional network (FCN) for semantic segmentation, FCN and its extensions have gradually become the preferred solution in the field of semantic labeling [17][18][19][20]. Though FCN-based methods can produce dense pixel-wise output directly, the pixel-wise classification derived from the final score map is quite coarse because of the sequential sub-sampling operations in the FCN.

To address the problem of coarse predictions, recent research [21][22][23][24][25][26] has further improved FCN-based methods for semantic labeling of remote sensing images. A growing number of studies [27][28][29][30][31] employ the encoder-decoder architecture with skip connections. UNet [32], a typical encoder-decoder model, reuses low-level information to refine the output, which results in better performance. To obtain accurate labeling of VHR images, an effective structure is needed to integrate the high-resolution, low-level features with the low-resolution, high-level features. The skip connection fuses features to compensate for the loss of spatial information caused by repeated local operations (e.g., pooling and strided convolution). Features passed via skip connections are multi-scale in nature due to the increasingly large receptive field sizes [33]. However, most existing approaches that are built on top of a contemporary classification network are good at aggregating global contexts. While the reuse of information from early encoding layers contributes to localization in the decoding phase, it may introduce redundant information which results in over-segmentation [3...

“…Visual Relationship Detection. Visual relationship detection has been investigated by many works in the last decade [21,8,7,31]. Lu et al [29] introduce generic visual relationship detection as a visual task, where they detect objects first, and then recognize predicates between object pairs.…”

Section: Related Work (mentioning)

confidence: 99%

Scene Graph Generation With External Knowledge and Image Reconstruction

Zhao

Lin

et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection, attributes and relationship prediction, etc. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes the development of a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from the external knowledge base to refine object and phrase features for improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework can generate better scene graphs, achieving the state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome datasets.

“…Fu et al [21] integrate local and global dependencies with both spatial and channel attention. Ding et al [17] employ semantic correlation to infer shape-variant context.…”

Section: Related Work, 2.1 Scene Segmentation (mentioning)

confidence: 99%

Boundary-Aware Feature Propagation for Scene Segmentation

Ding

Jiang

Liu

et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)

Self Cite

In this work, we address the challenging issue of scene segmentation. To increase the feature similarity of the same object while keeping the feature discrimination of different objects, we explore propagating information throughout the image under the control of objects' boundaries. To this end, we first propose to learn the boundary as an additional semantic class to enable the network to be aware of the boundary layout. Then, we propose unidirectional acyclic graphs (UAGs) to model the function of undirected cyclic graphs (UCGs), which structurize the image via building graphic pixel-by-pixel connections, in an efficient and effective way. Furthermore, we propose a boundary-aware feature propagation (BFP) module to harvest and propagate the local features within their regions isolated by the learned boundaries in the UAG-structured image. The proposed BFP is capable of splitting the feature propagation into a set of semantic groups via building strong connections among the same segment region but weak connections between different segment regions. Without bells and whistles, our approach achieves new state-of-the-art segmentation performance on three challenging semantic segmentation datasets, i.e., PASCAL-Context, CamVid, and Cityscapes.
