Figure 1. 5D temporal reconstruction of multiple 3D people in global coordinates with a dynamic camera. We introduce TRACE, a monocular one-stage method with a holistic 5D representation that enables the network to explicitly reason about human motion across time. Unlike previous methods [45,46] (b) that regress all 3D people from each frame in camera coordinates, TRACE (c) tracks the subjects (a) presented in the first frame through time and recovers their global trajectories in global coordinates in one shot.
This paper considers the adaptation of semantic segmentation from the synthetic source domain to the real target domain. Different from most previous explorations that often aim at developing adversarialbased domain alignment solutions, we tackle this challenging task from a new perspective, i.e., content-consistent matching (CCM). The target of CCM is to acquire those synthetic images that share similar distribution with the real ones in the target domain, so that the domain gap can be naturally alleviated by employing the content-consistent synthetic images for training. To be specific, we facilitate the CCM from two aspects, i.e., semantic layout matching and pixel-wise similarity matching. First, we use all the synthetic images from the source domain to train an initial segmentation model, which is then employed to produce coarse pixel-level labels for the unlabeled images in the target domain. With the coarse/accurate label maps for real/synthetic images, we construct their semantic layout matrixes from both horizontal and vertical directions and perform the matrixes matching to find out the synthetic images with similar semantic layout to real images. Second, we choose those predicted labels with high confidence to generate feature embeddings for all classes in the target domain, and further perform the pixel-wise matching on the mined layout-consistent synthetic images to harvest the appearance-consistent pixels. With the proposed CCM, only those content-consistent synthetic images are taken into account for learning the segmentation model, which can effectively alleviate the domain bias caused by those content-irrelevant synthetic images. Extensive experiments are conducted on two popular domain adaptation tasks, i.e., GTA5− →Cityscapes and SYNTHIA− →Cityscapes. Our CCM yields consistent improvements over the baselines and performs favorably against previous state-of-the-arts.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.