Joint extraction from unstructured text aims to extract relational triples composed of entity pairs and their relations. However, most existing works fail to handle the overlapping problem that arises when the same entities participate in different relational triples within a sentence. In this work, we propose a mutually exclusive Binary Cross Tagging (BCT) scheme and develop an end-to-end BCT framework to jointly extract overlapping entities and triples. Each entity token is assigned a mutually exclusive binary tag, and these tags are then cross-matched across all tag sequences to form triples. Our method is compared with other state-of-the-art models on two public English datasets and a large-scale Chinese dataset. Experiments show that our proposed framework achieves encouraging F1 scores on all three datasets investigated. Further detailed analysis demonstrates that our method performs strongly across the three overlapping patterns, especially as the overlapping problem becomes more complex.
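To make the cross-matching idea concrete, the following is a minimal, hypothetical Python sketch of how per-relation binary tag sequences for heads and tails could be decoded into spans and paired into triples. The tag layout, span decoding, and example sentence are illustrative assumptions, not the paper's exact BCT scheme.

```python
# Hypothetical sketch: cross-matching per-relation binary tag sequences into triples.
# The actual BCT tagging and matching rules may differ from this illustration.

def decode_spans(tags):
    """Turn a 0/1 tag sequence into (start, end) token spans of consecutive 1s."""
    spans, start = [], None
    for i, t in enumerate(tags + [0]):          # sentinel 0 closes a trailing span
        if t == 1 and start is None:
            start = i
        elif t != 1 and start is not None:
            spans.append((start, i - 1))
            start = None
    return spans

def cross_match(head_tags, tail_tags, relation, tokens):
    """Pair every tagged head span with every tagged tail span for one relation."""
    triples = []
    for hs, he in decode_spans(head_tags):
        for ts, te in decode_spans(tail_tags):
            head = " ".join(tokens[hs:he + 1])
            tail = " ".join(tokens[ts:te + 1])
            triples.append((head, relation, tail))
    return triples

tokens = ["Paris", "is", "the", "capital", "of", "France"]
head_tags = [1, 0, 0, 0, 0, 0]   # binary head sequence for relation "capital_of"
tail_tags = [0, 0, 0, 0, 0, 1]   # binary tail sequence for relation "capital_of"
print(cross_match(head_tags, tail_tags, "capital_of", tokens))
# [('Paris', 'capital_of', 'France')]
```

Because each relation keeps its own tag sequences, an entity tagged in several sequences naturally appears in several triples, which is how overlapping cases can be covered under this kind of scheme.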
Visual relationship detection (VRD) aims to locate objects and recognize their pairwise relationships for parsing scene graphs. To enable a deeper understanding of the visual scene, we propose a symmetric fusion learning model for visual relationship detection and scene graph parsing. We integrate object and relationship features at the visual and semantic levels for better relation feature mapping. First, we apply feature fusion to construct the visual module and introduce a semantic representation learning module combined with large-scale external knowledge. We minimize the loss by matching the visual and semantic embeddings with our symmetric learning module. The symmetric learning module, based on reverse cross-entropy, boosts cross-entropy symmetrically and performs reverse supervision for inaccurate annotations. Our model is compared with other state-of-the-art methods on two public datasets. Experiments show that our proposed model achieves encouraging performance in various metrics on the two datasets investigated. Further detailed analysis demonstrates that the proposed method performs better by partially alleviating the impact of inaccurate annotations.
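As a point of reference for the symmetric learning idea, below is a hedged NumPy sketch of a loss that combines standard cross-entropy with reverse cross-entropy, in the spirit of the module described above. The weights alpha and beta, the log(0) clamp A, and the toy inputs are illustrative assumptions rather than the paper's exact formulation.

```python
# Hypothetical sketch of a symmetric loss: standard CE plus reverse CE, where the
# reverse term lets the model's predictions softly supervise possibly noisy labels.
import numpy as np

def symmetric_cross_entropy(probs, labels, alpha=1.0, beta=1.0, A=-4.0):
    """probs: (N, C) predicted distributions; labels: (N,) integer class ids."""
    n, c = probs.shape
    one_hot = np.eye(c)[labels]
    # Standard CE: the annotated targets supervise the predictions.
    ce = -np.mean(np.log(np.clip(probs[np.arange(n), labels], 1e-12, 1.0)))
    # Reverse CE: predictions supervise the targets; log(0) is clamped to A.
    log_targets = np.where(one_hot > 0, 0.0, A)
    rce = -np.mean(np.sum(probs * log_targets, axis=1))
    return alpha * ce + beta * rce

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 2])   # the second label may be an annotation error
print(symmetric_cross_entropy(probs, labels))
```

The reverse term penalizes confident predictions that disagree with the annotation less harshly than plain cross-entropy would, which is one way such a symmetric loss can dampen the effect of inaccurate annotations.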
Construction hazards can occur at any time at outfield test sites and frequently result from improper interactions between objects. The majority of casualties might be avoided by following on-site regulations. However, workers may be unable to fully comply with safety regulations because of stress, fatigue, or negligence. The development of deep-learning-based computer vision and on-site video surveillance facilitates safety inspections, but automatic hazard identification is often limited by the semantic gap. This paper proposes an automatic hazard identification method that integrates on-site scene graph generation and domain-specific knowledge extraction. A BERT-based information extraction model is presented to automatically extract key regulatory information from outfield work safety requirements. Subsequently, an on-site scene parsing model is introduced to detect interactions between objects in images. An automatic safety checking approach is also established to perform personal protective equipment (PPE) compliance checks by integrating the detected textual and visual relational information. Experimental results show that our proposed method achieves strong performance in various metrics on a self-built dataset and widely used public datasets. The proposed method can precisely extract relational information from the visual and text modalities to facilitate on-site hazard identification.
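To illustrate how textual and visual relational information could be integrated for a compliance check, here is a minimal, hypothetical Python sketch that matches requirement triples extracted from safety regulations against triples produced by a scene graph parser. The triple format, vocabulary, and matching rule are assumptions made for illustration, not the paper's exact pipeline.

```python
# Hypothetical sketch: PPE compliance checking by comparing required
# (subject, relation, object) triples from regulations with observed
# triples from an on-site scene graph.

def check_compliance(required_triples, scene_triples):
    """Return every required triple that is missing from the parsed scene."""
    scene = set(scene_triples)
    return [triple for triple in required_triples if triple not in scene]

# Triples extracted from a textual requirement such as "workers must wear helmets".
required = [("worker", "wear", "helmet"), ("worker", "wear", "safety_vest")]
# Triples produced by the scene graph parser for one surveillance frame.
observed = [("worker", "wear", "helmet"), ("worker", "hold", "tool")]

violations = check_compliance(required, observed)
print(violations)   # [('worker', 'wear', 'safety_vest')]
```

In practice, entity labels from the detector would need to be normalized to the regulation vocabulary before matching; the set comparison above only shows the final checking step.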