2019
DOI: 10.1002/asi.24212

A Graph Combination With Edge Pruning‐Based Approach for Author Name Disambiguation

Abstract: Author name disambiguation (AND) is a challenging problem due to several issues such as missing key identifiers, the same name corresponding to multiple authors, and inconsistent representation. Several techniques have been proposed, but maintaining consistent accuracy levels over all data sets is still a major challenge. We identify two major issues associated with the AND problem. First, the namesake problem, in which two or more authors with the same name publish in a similar domain. Second, the diverse …

Citations: cited by 15 publications (3 citation statements)
References: 39 publications
“…In Km et al (2020), the authors present a graph-based approach where two graphs are combined together, a person-person graph obtained by connecting papers with shared coauthors, and a document-document graph which models similarity between publications' content. The document-document graph is obtained by first modeling abstract keywords with TF-IDF vectors and by drawing an edge between two nodes of the graphs when their similarity is higher than a selected threshold; subsequently, this graph is pruned by removing connections between papers whose shared referenced works are below a certain threshold.…”
Section: Graph-based Approaches (mentioning)
confidence: 99%
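
The document-document graph construction described in the statement above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function names, the smoothed IDF formula, cosine similarity as the similarity measure, and both threshold values (`sim_threshold`, `min_shared_refs`) are hypothetical choices that the statement does not specify.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(keywords):
    """Map each paper id to a sparse TF-IDF vector (term -> weight).

    keywords: dict of paper id -> list of abstract keyword tokens.
    The smoothed IDF used here is an assumption, not taken from the paper.
    """
    n_docs = len(keywords)
    doc_freq = Counter(t for tokens in keywords.values() for t in set(tokens))
    vectors = {}
    for pid, tokens in keywords.items():
        counts = Counter(tokens)
        vectors[pid] = {
            t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def document_graph(keywords, references, sim_threshold=0.3, min_shared_refs=1):
    """Return (edges, pruned_edges) for the document-document graph.

    Step 1: draw an edge when keyword similarity exceeds sim_threshold.
    Step 2: prune edges whose papers share fewer than min_shared_refs
            referenced works. Both thresholds are hypothetical values.
    """
    vectors = tfidf_vectors(keywords)
    edges = {
        (a, b)
        for a, b in combinations(sorted(keywords), 2)
        if cosine(vectors[a], vectors[b]) > sim_threshold
    }
    pruned = {
        (a, b)
        for a, b in edges
        if len(references[a] & references[b]) >= min_shared_refs
    }
    return edges, pruned
```

Here `references` is assumed to map each paper id to a set of cited-work identifiers, so the pruning step reduces to a set intersection per edge.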
“…Some errors discussed in the literature during the last years range from problems in transcribing large document collections [1] to namesake alias, homonymy or polysemy (when the same name corresponds to multiple authors), and name variability or synonymy (when an author appears under different names) [8]. Other common issues reported include missing identifiers, lack of standardized schemas, and inconsistencies in data representation [1,9].…”
Section: Related Work and Background (mentioning)
confidence: 99%
“…GHOST [6] utilizes a coauthorship graph to compute the similarity between node pairs, utilizing the attribute of coauthor only could achieve the same performance as previous complicated approach. A combined graph encompassing the author-author graph and document-document graph is put forward by Pooja [22]. Each connected component of the combined graph represents a distinct cluster.…”
Section: Related Work (mentioning)
confidence: 99%
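
The clustering step mentioned in the statement above, where each connected component of the combined graph is one cluster, can be illustrated with a small union-find sketch. How the two graphs are actually combined is not stated here, so this example assumes a plain union of their edge sets. The function name and input layout are hypothetical.

```python
def connected_components(nodes, *edge_sets):
    """Group nodes into clusters, one per connected component.

    nodes: iterable of paper ids.
    edge_sets: one or more iterables of (a, b) edges, e.g. the
               author-author edges and the document-document edges.
               Taking their plain union is an assumption of this sketch.
    """
    nodes = list(nodes)
    parent = {n: n for n in nodes}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edges in edge_sets:
        for a, b in edges:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(c) for c in clusters.values())
```

Under this assumption, every returned list would be read as the set of publications attributed to one distinct author.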