2018
DOI: 10.1007/s11192-018-2824-5

Evaluating author name disambiguation for digital libraries: a case of DBLP

Abstract: Author name ambiguity in a digital library may affect the findings of research that mines authorship data of the library. This study evaluates author name disambiguation in DBLP, a widely used but insufficiently evaluated digital library for its disambiguation performance. In doing so, this study takes a triangulation approach that author name disambiguation for a digital library can be better evaluated when its performance is assessed on multiple labeled datasets with comparison to baselines. Tested on three …

Citations: cited by 52 publications (72 citation statements)
References: 44 publications
“…The DBLP name disambiguation showed a good performance against a sample of authors with most ambiguous names (Kim, 2018; Kim & Diesner, 2015). However, it surely has disambiguation errors due to faulty merging or splitting of unique identities (for details, see Kim (2018)), which may affect the outcomes of this study. Thus, the findings of this study should be understood to represent only the given data set as it is.…”
Section: Conclusion and Discussion
mentioning
confidence: 89%
“…DBLP data have been analyzed in numerous studies for name disambiguation, collaboration mapping, and data management (for instance, Cavero et al., 2014; Franceschet, 2011; Kim & Diesner, 2017; Shi et al., 2011). Recently, the accuracy of DBLP author name disambiguation was evaluated on multiple labeled data sets (Kim, 2018; Kim & Diesner, 2015). The DBLP disambiguation was highly accurate and performed better than other algorithmic disambiguation techniques except on some homonym cases (Kim, 2018).…”
Section: Data
mentioning
confidence: 99%
“…Per feature matching and clustering is used to generate the training/evaluation data. Experiments are done on the Web of Science data set and the performances are compared [30]. They used author names, coauthor names, titles, and venue for similarity calculation between new records and the retrieved block of records.…”
Section: Related Work
mentioning
confidence: 99%
“…In addition, it seems that disambiguation design for DBLP and disambiguated MEDLINE (i.e., Author-ity) aimed at less merging (≈ high precision) than less splitting (≈ high recall) because merging is more detrimental to bibliometrics and network analysis than splitting (Fegley & Torvik, 2013; Kim, 2018; Müller et al., 2017). This high splitting may be a result of an algorithmic decision by the MAG data team who clarified that, for an academic release purpose, a basic level of disambiguation was conducted for MAG.…”
mentioning
confidence: 99%
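
The last statement above links merging errors to precision and splitting errors to recall. A minimal sketch of that relationship follows, assuming a pairwise formulation of precision and recall over author-record clusters. This formulation is a common choice in disambiguation evaluation, but it is an assumption here and is not confirmed as the measure used in the cited studies. The function names and the toy labels are hypothetical.

```python
from itertools import combinations


def same_cluster_pairs(labels):
    """Return the set of record-index pairs that share a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_precision_recall(truth, predicted):
    """Pairwise precision and recall of predicted clusters against truth.

    Precision falls when records of different authors are merged;
    recall falls when records of one author are split apart.
    """
    true_pairs = same_cluster_pairs(truth)
    pred_pairs = same_cluster_pairs(predicted)
    correct = len(true_pairs & pred_pairs)
    # Convention (an assumption): an empty pair set scores 1.0.
    precision = correct / len(pred_pairs) if pred_pairs else 1.0
    recall = correct / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Toy data (hypothetical): five records, two true authors "A" and "B".
truth = ["A", "A", "A", "B", "B"]
over_merged = ["x", "x", "x", "x", "x"]  # everything merged into one identity
over_split = ["p", "q", "r", "s", "s"]   # author "A" split into three identities

merged_scores = pairwise_precision_recall(truth, over_merged)
split_scores = pairwise_precision_recall(truth, over_split)
print("over-merged:", merged_scores)
print("over-split: ", split_scores)
```

On this toy input, the fully merged clustering keeps recall at 1.0 while precision drops, and the split clustering keeps precision at 1.0 while recall drops, which mirrors the trade-off the statement describes.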