2020
DOI: 10.3390/e22040416

A Graph-Based Author Name Disambiguation Method and Analysis via Information Theory

Abstract: Name ambiguity, due to the fact that many people share an identical name, often deteriorates the performance of information integration, document retrieval and web search. In academic data analysis, author name ambiguity usually decreases the analysis performance. To solve this problem, an author name disambiguation task is designed to divide documents related to an author name reference into several parts and each part is associated with a real-life person. Existing methods usually use either attributes of do…

Citations: cited by 10 publications (7 citation statements)
References: 27 publications
“…Peng et al [16] improved [15], increase the information types, and propose a semi-supervised clustering algorithm. The Graph Auto Encoder, as a classical unsupervised feature extraction algorithm, was also used to extract publication features by researchers [7], and they used the HAC algorithm in clustering. However, it could not mine a deep relationship between publications because the homogeneous graph can not preserve a complex relationship.…”
Section: Related Work (mentioning)
confidence: 99%
“…The second challenge is determining the number of distinct authors with the same name, that is, to determine the size of clusters. Since common clustering methods that require a specified size of clusters cannot be applied to this task, many methods similar to hierarchical clustering and AP clustering are used [3,7]. However, these clustering algorithms also require specifying the relevant parameters that control the size of clusters.…”
Section: Introduction (mentioning)
confidence: 99%
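The point made in the statement above, that hierarchical clustering avoids a fixed cluster count but still depends on a parameter that controls it, can be illustrated with a minimal sketch. This is not the method of the cited paper: the function `hac_single_linkage`, the single-linkage criterion, the 1-D toy values, and the thresholds are all assumptions chosen for illustration.

```python
from itertools import combinations


def hac_single_linkage(points, threshold):
    """Agglomerative clustering of 1-D points (illustrative sketch only).

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > 1:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break  # No remaining pair is close enough to merge.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical 1-D "publication embeddings" for one ambiguous author name.
toy = [0.0, 0.1, 0.2, 1.0, 1.1, 5.0]
n_tight = len(hac_single_linkage(toy, threshold=0.15))
n_loose = len(hac_single_linkage(toy, threshold=1.0))
print(n_tight, n_loose)
```

On this toy input the same data yields a different number of "authors" depending only on the threshold, which is the dependence the citing authors describe.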
“…In Shin et al (2014), another framework is constructed based on graph operations such as vertex splitting and vertex merging of co-authorship graphs and shown to mostly outperform three existing unsupervised methods on various standard evaluation metrics. A recent addition to the literature on unsupervised methods, (Ma et al 2020), claims to offer a superior performance than GHOST and other state-of-the-art methods by employing representation learning, evaluated on AMiner, which integrates data from popular databases with several academic bibliographic collections. They use word2vec for document representation, followed by a Graph Auto-Encoder.…”
Section: Related Work (mentioning)
confidence: 99%
“…Thus, they are applicable to any database and domain, as these attributes are always available, at least to a certain degree of completeness. Almost all approaches, possibly with the notable exception of Ma et al (2020), assume that co-authors are the most important information for author disambiguation. A peculiarity of astronomy in this regard is a generally larger average number of authors per article than in other disciplines, due to the existence of populous collaborations, often comprising hundreds of members.…”
Section: Related Work (mentioning)
confidence: 99%
“…Although NDA could be used to assist in NED tasks, NED typically strongly relies on the text, e.g., by characterizing the context in which the named entity occurs (e.g., paper topic) [14]. Similarly, Ma et al [15] proposes a name disambiguation model based on representation learning employing attributes and network connections, by first encoding the attributes of each paper using variational graph auto-encoder, then computing a similarity metric from the relationship of these attributes, and then using graph embedding to leverage the author relationships, heavily relying on NLP.…”
Section: Related Work (mentioning)
confidence: 99%