Entity alignment (EA) finds equivalent entities located in different knowledge graphs (KGs), which is an essential step toward enhancing the quality of KGs and is hence of significance to downstream applications (e.g., question answering and recommendation). Recent years have witnessed a rapid increase in the number of EA approaches, yet their relative performance remains unclear, partly due to incomplete empirical evaluations, as well as the fact that comparisons were carried out under different settings (i.e., datasets, information used as input, etc.). In this paper, we fill this gap by conducting a comprehensive evaluation and detailed analysis of state-of-the-art EA approaches. We first propose a general EA framework that encompasses all current methods, and then group existing methods into three major categories. Next, we judiciously evaluate these solutions on a wide range of use cases in terms of their effectiveness, efficiency, and robustness. Finally, we construct a new EA dataset to mirror the real-life challenges of alignment, which have largely been overlooked by the existing literature. This study strives to provide a clear picture of the strengths and weaknesses of current EA approaches, so as to inspire quality follow-up research.

From where we stand, EA can be deemed a special case of entity resolution (ER), which recalls a large body of literature (to be discussed in Section 2.2). Thus, some ER methods (with minor adaptations to handle EA) are also involved in this study to ensure the comprehensiveness of the research. In this article, we provide an empirical evaluation of state-of-the-art EA approaches with the following features:

(1) Fair comparison within and across categories. Almost all recent studies [5], [24], [38], [55], [60], [61], [62], [63], [67] are confined to comparing with only a subset of methods. In addition, different approaches follow different settings: some merely use the KG structure for alignment, while others also utilize additional information; some align KGs in one pass, while others employ an iterative (re-)training strategy. Although a direct comparison of these methods, as reported in the literature, demonstrates the overall effectiveness of the solutions, a preferable and fairer practice is to group these methods into categories and then compare the results both within and across categories. In this study, we include most state-of-the-art methods for lateral comparison, including very recent efforts that have not been compared with others before. By dividing them into three groups and conducting detailed analysis of both intra- and inter-group evaluations, we are able to better position these approaches and assess their effectiveness. (A minimal sketch of the matching and evaluation step common to these methods appears at the end of this section.)

(2) Comprehensive evaluation on representative datasets. To evaluate the performance of EA systems, several datasets have been constructed, which can be broadly categorized into cross-lingual benchmarks, represented by DBP15K [53], and mono-lingual benchmarks, represented by DWY100K [54]. A very recent study [24] points out th...
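To make the task concrete, below is a minimal, illustrative sketch (not taken from any of the surveyed systems) of the matching step that embedding-based EA methods share: entities from two KGs are embedded into a common space, each source entity is matched to its nearest target entity under cosine similarity, and effectiveness is reported as Hits@k against gold alignment links. The embeddings here are synthetic stand-ins; in practice they would come from whichever EA model is being evaluated.

```python
# Hedged sketch: nearest-neighbor alignment over entity embeddings plus Hits@k.
# All names and the toy data below are illustrative assumptions, not the
# paper's framework or any specific system's API.
import numpy as np

def hits_at_k(src_emb: np.ndarray, tgt_emb: np.ndarray,
              gold: list[tuple[int, int]], ks=(1, 10)) -> dict[int, float]:
    """Compute Hits@k for gold links (src_idx, tgt_idx) under cosine similarity."""
    # L2-normalize so that the dot product equals cosine similarity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T                       # (n_src, n_tgt) similarity matrix
    ranking = np.argsort(-sim, axis=1)      # targets ranked per source, best first
    hits = {k: 0 for k in ks}
    for s, t in gold:
        rank = int(np.where(ranking[s] == t)[0][0])  # 0-based rank of true match
        for k in ks:
            hits[k] += rank < k
    return {k: hits[k] / len(gold) for k in ks}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for learned embeddings: the target KG's vectors are a
    # noisy copy of the source KG's, so entity i in KG1 matches entity i in KG2.
    src = rng.normal(size=(100, 32))
    tgt = src + 0.1 * rng.normal(size=(100, 32))
    gold = [(i, i) for i in range(100)]
    print(hits_at_k(src, tgt, gold))        # e.g., {1: 1.0, 10: 1.0} on this toy data
```

The same routine applies regardless of how the embeddings were produced (structure-only models, models using additional information, or iteratively retrained ones), which is what allows the intra- and inter-group comparisons described above to share one evaluation protocol.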