2021
DOI: 10.1002/asi.24459
Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning

Abstract: In several author name disambiguation studies, some ethnic name groups such as East Asian names are reported to be more difficult to disambiguate than others. This implies that disambiguation approaches might be improved if ethnic name groups are distinguished before disambiguation. We explore the potential of ethnic name partitioning by comparing performance of four machine learning algorithms trained and tested on the entire data or specifically on individual name groups. Results show that ethnicity‐based na…

Cited by 10 publications (4 citation statements)
References 45 publications
“…However, the consideration of individual publication histories at the level of authors or gender analysis requires name disambiguation. Obviously, this task is a big challenge, especially when dealing with non-Western names (Gomide et al, 2017; Kim et al, 2021; Treeratpituk & Giles, 2012). Since the majority of publications in the Ukrainian Economic discipline are related to local authors (Mryglod et al, 2021) it is natural to find mainly Ukrainian first and last names in our data set.…”
Section: Research Paper (mentioning)
confidence: 99%
“…The work on name separation using machine learning is much more limited than the work for ethnicity classification using machine learning. Kim et al [25] tackled the problem of distinguishing authors who have the same names (i.e., same name means same first forename initial and full surname) in bibliographic. They used ethnicity information for better disambiguation performance.…”
Section: Ethnicity Classification and Name Separation (mentioning)
confidence: 99%
“…Specifically, in WhoisWho, most last names (53 out of 65) are Chinese last names, which is inconsistent with the fact that the authors are from all over the world. Note that this issue is nontrivial because many studies have confirmed that different ethnicities have different levels of ambiguities (Louppe et al, 2016), and, based on this idea, some ethnicity‐based disambiguation methods have been successfully developed (Kim et al, 2021; Louppe et al, 2016; Subramanian et al, 2021). Name variation is another frequently ignored aspect in building AND datasets.…”
Section: Related Work (mentioning)
confidence: 99%