2022
DOI: 10.1007/s11192-022-04480-w

Algorithmic identification of Ph.D. thesis-related publications: a proof-of-concept study

Abstract: In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included publications, which we match with records in the Scopus database. We then test supervised classification methods on the task of identifying the correct associated publications among high numbers of potential candidates…
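To make the task in the abstract concrete, here is a minimal Python sketch of how thesis/candidate pairs might be turned into features and how a very simple supervised rule could be learned from labeled pairs. Everything in it is an assumption made for illustration: the `Record` fields, the three features (title similarity, surname match, year gap), and the single-threshold learner `fit_threshold` are hypothetical, and the truncated abstract does not say which features or classifiers the study actually used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Record:
    """Hypothetical bibliographic record (thesis or candidate publication)."""
    title: str
    author_surname: str
    year: int


def pair_features(thesis: Record, candidate: Record) -> dict:
    """Build illustrative features for one thesis/candidate pair.

    These features are assumptions, not the feature set of the study.
    """
    title_sim = SequenceMatcher(
        None, thesis.title.lower(), candidate.title.lower()
    ).ratio()
    same_author = thesis.author_surname.lower() == candidate.author_surname.lower()
    year_gap = abs(thesis.year - candidate.year)
    return {"title_sim": title_sim, "same_author": same_author, "year_gap": year_gap}


def fit_threshold(scores: list, labels: list) -> float:
    """Learn a cut-off on one feature from labeled pairs.

    This is a one-feature decision stump, used here as a stand-in for the
    supervised classifiers mentioned in the abstract. It returns the observed
    score that maximises training accuracy when predicting `score >= cut-off`.
    """
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(scores)):
        acc = sum((s >= t) == y for s, y in zip(scores, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


# Usage with made-up records and made-up training scores.
thesis = Record("Graph matching for bibliometric records", "Meyer", 2020)
candidate = Record("Graph matching for bibliometric records", "Meyer", 2019)
feats = pair_features(thesis, candidate)
cutoff = fit_threshold([0.1, 0.2, 0.8, 0.9], [False, False, True, True])
predicted_related = feats["same_author"] and feats["title_sim"] >= cutoff
```

A real pipeline would train on the manually curated ground truth described in the abstract and would likely use a richer feature set and a proper classifier; the stump above only illustrates the shape of the problem (many candidates per thesis, a binary related/unrelated label per pair).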

Citations: cited by 2 publications (2 citation statements)
References: 12 publications
“…Additionally, automatic, or alternative formal matching algorithms between publications and dissertations could be used to cover more research fields, languages or countries (e.g. Donner, 2022; Echeverria et al, 2015; Heinisch & Buenstorf, 2018). However, based on the results of the study, we can conclude that a policy that allows doctoral students to write cumulative dissertations permits them to strengthen their research output counted as papers published or cited.…”
Section: Discussion
Citation type: mentioning
confidence: 98%
“…One of the more exciting recent developments in empirical social science research is the increasing availability of large administrative databases and the ability to link across them to generate new insights. Indeed linked datasets have allowed researchers working in descriptive, predictive, and causal modalities to generate systematic inferences about large numbers of individuals [1][2][3][4][5][6][7][8][9]. Unfortunately, most administrative datasets are not designed to be linked to others and thus have no common and reliable unique identifiers.…”
Section: Introduction
Citation type: mentioning
confidence: 99%