3D scene graph generation (SGG) aims to simultaneously predict object classes and predicates in a 3D point cloud scene with instance segmentation. Since the underlying semantics of 3D point clouds are spatial, recent approaches to 3D SGG usually have difficulty understanding global contextual semantic relationships and neglect the intrinsic 3D visual structures. To capture the global scope of semantic relationships, we first propose two types of Semantic Clue (SC), at the entity level and the path level. SC can be extracted from the training set and modeled as the co-occurrence probability between entities. We then design a novel Semantic Clue aware Graph Convolution Network (SC-GCN) that explicitly models each SC, passing messages according to the neighbor pattern specific to that SC. To model the interactions between the 3D visual and semantic modalities, we propose a visual-language transformer (VLT) module that jointly learns the correlation between 3D visual features and class label embeddings. Systematic experiments on the 3D semantic scene graph (3DSSG) dataset show that our full method achieves state-of-the-art performance.
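The idea of extracting an entity-level clue as a co-occurrence probability can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `entity_cooccurrence`, the toy scenes, and the choice to estimate the conditional probability P(b present | a present) from per-scene label sets are all assumptions made for illustration, since the abstract does not specify the exact estimator.

```python
from collections import Counter
from itertools import permutations


def entity_cooccurrence(scenes):
    """Estimate P(b in scene | a in scene) for every ordered label pair.

    `scenes` is a list of per-scene object label lists (hypothetical
    training annotations). Duplicate labels within a scene count once.
    """
    scene_count = Counter()  # number of scenes containing each label
    pair_count = Counter()   # number of scenes containing both a and b
    for labels in scenes:
        present = set(labels)
        scene_count.update(present)
        pair_count.update(permutations(present, 2))
    return {(a, b): n / scene_count[a] for (a, b), n in pair_count.items()}


# Toy training set (assumed labels, for illustration only).
scenes = [
    ["chair", "table", "floor"],
    ["chair", "floor"],
    ["bed", "floor"],
]
probs = entity_cooccurrence(scenes)
```

Here `probs[("chair", "table")]` is the fraction of chair-containing scenes that also contain a table. Pairs that never co-occur are simply absent from the dictionary; a real system would likely store these values in a dense matrix indexed by class id.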
Unsupervised cross-domain counting research using synthetic datasets has become pressing given the laborious labeling that supervised methods require. However, existing methods focus only on learning domain-shared knowledge to narrow the gap between the source and target domains (the inter-domain gap), and do not consider the large distribution gap within the target domain data itself (the intra-domain gap). In this paper, we propose a two-step domain adaptation method with multi-level feature response branches, which additionally uses intra-domain knowledge to strengthen adaptability to the target domain. Specifically, we first use different feature response branches to learn inter-domain knowledge more robustly, reducing prediction inconsistency across scenarios. Subsequently, the trained model is used to generate pseudo-labels for the target domain, and the entire model is retrained using these pseudo-labels. Experiments on the synthetic dataset GCC and three real public datasets validate the effectiveness of the proposed method, which achieves higher accuracy.