Tuan-Hung Vu scite author profile

Semantic segmentation is a key problem for many computer vision tasks. While approaches based on convolutional neural networks constantly break new records on different benchmarks, generalizing well to diverse testing environments remains a major challenge. In numerous real-world applications, there is a large gap between the data distributions of the train and test domains, which results in severe performance loss at run-time. In this work, we address the task of unsupervised domain adaptation in semantic segmentation with losses based on the entropy of the pixel-wise predictions. To this end, we propose two novel, complementary methods using (i) an entropy loss and (ii) an adversarial loss, respectively. We demonstrate state-of-the-art performance in semantic segmentation on two challenging "synthetic-2-real" set-ups and show that the approach can also be used for detection.
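As a rough illustration of what an entropy computed on pixel-wise predictions can look like, the sketch below averages the Shannon entropy of each pixel's class distribution over a small logit map. It is a minimal sketch and not the authors' implementation: the function names (`softmax`, `pixel_entropy`, `entropy_loss`), the nested-list input layout and the normalization by log(C) are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Convert a list of per-class scores into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixel_entropy(probs):
    """Shannon entropy of one pixel's class distribution, scaled to [0, 1].

    Dividing by log(C) is an assumption of this sketch; it makes the
    value independent of the number of classes C.
    """
    num_classes = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(num_classes)

def entropy_loss(logit_map):
    """Mean normalized entropy over all pixels of an H x W x C logit map
    given as nested lists (hypothetical layout for this example)."""
    entropies = [pixel_entropy(softmax(px)) for row in logit_map for px in row]
    return sum(entropies) / len(entropies)

if __name__ == "__main__":
    uncertain = [[[0.0, 0.0, 0.0]]]   # uniform prediction: high entropy
    confident = [[[20.0, 0.0, 0.0]]]  # peaked prediction: low entropy
    print(entropy_loss(uncertain), entropy_loss(confident))
```

Minimizing such a quantity on unlabeled target images pushes predictions toward confident, low-entropy outputs, which is the intuition the abstract points to.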


DADA: Depth-Aware Domain Adaptation in Semantic Segmentation

Jain

Bucher

et al. 2019

184

147


Figure 1: We propose a novel depth-aware domain adaptation framework (DADA) to efficiently leverage depth as privileged information in the unsupervised domain adaptation setting. This example shows how semantic segmentation of a scene from the target domain benefits from the proposed approach, in comparison to state-of-the-art domain adaptation with no use of depth. At the top of the figure, different background colors (blue and red) represent the source and target information available during training. Here, annotated source domain data come from the synthetic SYNTHIA dataset and un-annotated target domain images are real scenes from Cityscapes. The cyclist highlighted by the yellow box is a good qualitative illustration of the improvement we obtain.

Unsupervised domain adaptation (UDA) is important for applications where large-scale annotation of representative data is challenging. For semantic segmentation in particular, it helps deploy, on real "target domain" data, models that are trained on annotated images from a different "source domain", notably a virtual environment. To this end, most previous works consider semantic segmentation as the only mode of supervision for source domain data, while ignoring other, possibly available, information like depth. In this work, we aim to make the best use of such privileged information while training the UDA model. We propose a unified depth-aware UDA framework that leverages the knowledge of dense depth in the source domain in several complementary ways. As a result, the performance of the trained semantic segmentation model on the target domain is boosted. Our novel approach achieves state-of-the-art performance on different challenging synthetic-2-real benchmarks. Code and models are available at https://github.com/valeoai/DADA.


xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

et al. 2020


Context-Aware CNNs for Person Head Detection

Vu

Osokin

Laptev

2015

111


Person detection is a key problem for many computer vision tasks. While face detection has reached maturity, detecting people under the full variation of camera viewpoints, human poses, lighting conditions and occlusions is still a difficult challenge. In this work we focus on detecting human heads in natural scenes. Starting from the recent local R-CNN object detector, we extend it with two types of contextual cues. First, we leverage person-scene relations and propose a Global CNN model trained to predict positions and scales of heads directly from the full image. Second, we explicitly model pairwise relations among objects and train a Pairwise CNN model using a structured-output surrogate loss. The Local, Global and Pairwise models are combined into a joint CNN framework. To train and test our full model, we introduce a large dataset composed of 369,846 human heads annotated in 224,740 movie frames. We evaluate our method and demonstrate improvements in person head detection over several recent baselines on three datasets. We also show that our model improves detection speed.


Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation

Saporta

Cord

et al. 2021



Tuan-Hung Vu

ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation
