We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains. Our approach draws on an insight connecting two existing works: curriculum domain adaptation and self-training. Inspired by the former, PyCDA constructs a pyramid curriculum which contains various properties about the target domain. Those properties are mainly about the desired label distributions over the target domain images, image regions, and pixels. By enforcing the segmentation neural network to observe those properties, we can improve the network's generalization capability to the target domain. Motivated by the self-training, we infer this pyramid of properties by resorting to the semantic segmentation network itself. Unlike prior work, we do not need to maintain any additional models (e.g., logistic regression or discriminator networks) or to solve min max problems which are often difficult to optimize. We report state-of-the-art results for the adaptation from both GTAV and SYNTHIA to Cityscapes, two popular settings in unsupervised domain adaptation for semantic segmentation.
This paper investigates the geometric consistency for monocular 3D object detection, which suffers from the illposed depth estimation. We first conduct a thorough analysis to reveal how existing methods fail to consistently localize objects when different geometric shifts occur. In particular, we design a series of geometric manipulations to diagnose existing detectors and then illustrate their vulnerability to consistently associate the depth with object apparent sizes and positions. To alleviate this issue, we propose four geometry-aware data augmentation approaches to enhance the geometric consistency of the detectors. We first modify some commonly used data augmentation methods for 2D images so that they can maintain geometric consistency in 3D spaces. We demonstrate such modifications are important. In addition, we propose a 3D-specific image perturbation method that employs the camera movement. During the augmentation process, the camera system with the corresponding image is manipulated, while the geometric visual cues for depth recovery are preserved. We show that by using the geometric consistency constraints, the proposed augmentation techniques lead to improvements on the KITTI and nuScenes monocular 3D detection benchmarks with state-of-the-art results. In addition, we demonstrate that the augmentation methods are well suited for semisupervised training and cross-dataset generalization.
Existing domain adaptation methods generally assume different domains have the identical label space, which is quite restrict for real world applications. In this paper, we focus on a more realistic and challenging case of open set domain adaptation. Particularly, in open set domain adaptation, we allow the classes from the source and target domains to be partially overlapped. In this case, the assumption of conventional distribution alignment does not hold anymore, due to the different label spaces in two domains. To tackle this challenge, we propose a new approach coined as Known-class Aware Self-Ensemble (KASE), which is built upon the recently developed self-ensemble model. In KASE, we first introduce a Known-class Aware Recognition (KAR) module to identify the known and unknown classes from the target domain, which is achieved by encouraging a low cross-entropy for known classes and a high entropy based on the source data from the unknown class. Then, we develop a Known-class Aware Adaptation (KAA) module to better adapt from the source domain to the target by reweighing the adaptation loss based on the likeliness to belong to known classes of unlabeled target samples as predicted by KAR. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.