Entity alignment (EA) identifies entities that refer to the same real-world object but reside in different knowledge graphs (KGs), and has been harnessed for KG construction and integration. When generating EA results, current embedding-based solutions treat entities independently and fail to take into account the interdependence between entities. In addition, most embedding-based EA methods either fuse different features at the representation level and generate a unified entity embedding for alignment, which potentially causes information loss, or aggregate features at the outcome level with hand-tuned weights, which becomes impractical as the number of features grows. To tackle these deficiencies, we propose a collective embedding-based EA framework with an adaptive feature fusion mechanism. We first employ three representative features, i.e., structural, semantic and string signals, to capture different aspects of the similarity between entities in heterogeneous KGs. These features are then integrated at the outcome level, with dynamically assigned weights generated by our carefully devised adaptive feature fusion strategy. Finally, in order to make collective EA decisions, we formulate EA as the classical stable matching problem between the entities to be aligned, with preference lists constructed from the fused feature matrix. The problem is then solved effectively by the deferred acceptance algorithm. Our proposal is evaluated on both cross-lingual and mono-lingual EA benchmarks against state-of-the-art solutions, and the empirical results verify its effectiveness and superiority. We also perform an ablation study to gain insights into the framework modules.
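The stable-matching step described in this abstract can be illustrated with a minimal sketch of deferred acceptance (Gale-Shapley). It is not the authors' implementation: the function name `deferred_acceptance`, the toy matrix, and the assumptions that the fused similarity matrix is square and that both sides rank candidates by descending similarity are all illustrative choices.

```python
def deferred_acceptance(sim):
    """Stable matching between source entities (rows) and target entities (columns).

    sim[i][j] is the fused similarity between source i and target j.
    Assumption: the matrix is square, and both sides prefer higher similarity.
    """
    n = len(sim)
    # Preference list of each source: target indices, most similar first.
    prefs = [sorted(range(n), key=lambda j: -sim[i][j]) for i in range(n)]
    next_choice = [0] * n   # index into prefs[i] of the next target to propose to
    holder = [None] * n     # holder[j] = source tentatively held by target j
    free = list(range(n))   # sources that currently have no tentative match

    while free:
        i = free.pop()
        j = prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = holder[j]
        if current is None:
            holder[j] = i                  # target j was unmatched: accept
        elif sim[i][j] > sim[current][j]:
            holder[j] = i                  # target j prefers the new proposer
            free.append(current)
        else:
            free.append(i)                 # proposal rejected: try the next target

    return {holder[j]: j for j in range(n)}


# Toy fused matrix: sources 0 and 1 both rank target 0 first.
fused = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
]
alignment = deferred_acceptance(fused)
print(alignment)
```

In the toy matrix, picking the best target for each source independently would assign target 0 to both source 0 and source 1. The collective procedure resolves the conflict: target 0 keeps source 0, and source 1 falls back to target 1.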
Entity alignment (EA) finds equivalent entities that are located in different knowledge graphs (KGs), which is an essential step to enhance the quality of KGs, and hence of significance to downstream applications (e.g., question answering and recommendation). Recent years have witnessed a rapid increase in EA approaches, yet their relative performance remains unclear, partly due to incomplete empirical evaluations, as well as the fact that comparisons were carried out under different settings (i.e., datasets, information used as input, etc.). In this paper, we fill in the gap by conducting a comprehensive evaluation and detailed analysis of state-of-the-art EA approaches. We first propose a general EA framework that encompasses all the current methods, and then group existing methods into three major categories. Next, we judiciously evaluate these solutions on a wide range of use cases, based on their effectiveness, efficiency and robustness. Finally, we construct a new EA dataset to mirror the real-life challenges of alignment, which were largely overlooked by existing literature. This study strives to provide a clear picture of the strengths and weaknesses of current EA approaches, so as to inspire quality follow-up research.
1. From where we stand, EA can be regarded as a special case of entity resolution (ER), which has an extensive literature (to be discussed in Section 2.2). Thus, some ER methods (with minor adaptation to handle EA) are also included in this study to ensure its comprehensiveness.
In this article, we provide an empirical evaluation of state-of-the-art EA approaches with the following features: (1) Fair comparison within and across categories. Almost all recent studies [5], [24], [38], [55], [60], [61], [62], [63], [67] are confined to comparing with only a subset of methods.
In addition, different approaches follow different settings: some merely use the KG structure for alignment, while others also utilize additional information; some align KGs in one pass, while others employ an iterative (re-)training strategy. Although a direct comparison of these methods, as reported in the literature, demonstrates the overall effectiveness of the solutions, a fairer practice would be to group these methods into categories and then compare the results both within and across categories. In this study, we include most state-of-the-art methods for lateral comparison, including very recent efforts that have not yet been compared with others. By dividing them into three groups and conducting detailed analysis of both intra- and inter-group evaluations, we are able to better position these approaches and assess their effectiveness. (2) Comprehensive evaluation on representative datasets. To evaluate the performance of EA systems, several datasets have been constructed, which can be broadly categorized into cross-lingual benchmarks, represented by DBP15K [53], and mono-lingual benchmarks, represented by DWY100K [54]. A very recent study [24] points out th...
Entity alignment (EA) aims to discover equivalent entities in knowledge graphs (KGs), which bridges heterogeneous sources of information and facilitates the integration of knowledge. Existing EA solutions mainly rely on structural information to align entities, typically through KG embedding. Nonetheless, in real-life KGs, only a few entities are densely connected to others, and the majority of the rest possess rather sparse neighborhood structure. We refer to the latter as long-tail entities, and observe that this phenomenon arguably limits the use of structural information for EA. To mitigate the issue, we revisit and investigate the conventional EA pipeline in pursuit of elegant performance. For pre-alignment, we propose to amplify long-tail entities, which have relatively weak structural information, with entity name information that is generally available (but overlooked), in the form of concatenated power mean word embeddings. For alignment, under a novel complementary framework that consolidates structural and name signals, we identify an entity's degree as important guidance for effectively fusing the two different sources of information. To this end, a degree-aware co-attention network is conceived, which dynamically adjusts the significance of features in a degree-aware manner. For post-alignment, we propose to complement the original KGs with facts from their counterparts by using confident EA results as anchors via iterative training. Comprehensive experimental evaluations validate the superiority of our proposed techniques.
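As a rough illustration of the name representation mentioned in this abstract, the sketch below builds a concatenated power mean embedding for an entity name. The abstract does not state which power means are used, so the choice here (arithmetic mean, minimum and maximum, i.e. p = 1, p = -inf and p = +inf) is an assumption, as are the function name `power_mean_name_embedding` and the toy two-dimensional word vectors.

```python
def power_mean_name_embedding(word_vectors):
    """Concatenate per-dimension power means of the word vectors of a name.

    Assumption: three power means are used, p = 1 (arithmetic mean),
    p = -inf (minimum) and p = +inf (maximum), in that order.
    """
    if not word_vectors:
        raise ValueError("entity name has no word vectors")
    dims = range(len(word_vectors[0]))
    count = len(word_vectors)
    mean = [sum(v[d] for v in word_vectors) / count for d in dims]
    minimum = [min(v[d] for v in word_vectors) for d in dims]
    maximum = [max(v[d] for v in word_vectors) for d in dims]
    return mean + minimum + maximum


# Hypothetical 2-dimensional word vectors, for illustration only.
toy_vectors = {"new": [1.0, 0.0], "york": [0.0, 2.0]}
name_tokens = ["new", "york"]
embedding = power_mean_name_embedding([toy_vectors[t] for t in name_tokens])
print(embedding)
```

The result has three times the dimensionality of the word vectors. Each concatenated block summarizes the name's words differently, which is why the representation can carry more information than a plain average.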